ABSTRACT

Advances in observational capabilities have ushered in a new era of multi-wavelength,
multi-physics probes of galaxy clusters, and ambitious surveys are compiling large
samples of cluster candidates selected in different ways. We use a high-resolution N-body
simulation to study how the influence of large-scale structure in and around clusters
causes correlated signals in different physical probes and discuss some implications this
has for multi-physics probes of clusters (e.g. richness, lensing, Compton distortion and
velocity dispersion).

We pay particular attention to velocity dispersions, matching galaxies to subhalos
which are explicitly tracked in the simulation. We find that not only do halos persist
as subhalos when they fall into a larger host, groups of subhalos retain their identity
for long periods within larger host halos. The highly anisotropic nature of infall into
massive clusters, and their triaxiality, translates into an anisotropic velocity ellipsoid:
line-of-sight galaxy velocity dispersions for any individual halo show large variance
depending on viewing angle. The orientation of the velocity ellipsoid is correlated with
the large-scale structure, and thus velocity outliers correlate with outliers caused by
projection in other probes. We quantify this orientation uncertainty and give illustrative
examples. Such a large variance suggests that velocity dispersion estimators will
work better in an ensemble sense than for any individual cluster, which may inform
strategies for obtaining redshifts of cluster members. We similarly find that the ability
of substructure indicators to find kinematic substructures is highly viewing angle
dependent. While groups of subhalos which merge with a larger host halo can retain
their identity for many Gyr, they are only sporadically picked up by substructure
indicators.

We discuss the effects of correlated scatter on scaling relations estimated through 
stacking, both analytically and in the simulations, showing that the strong correlation 
of measures with mass and the large scatter in mass at fixed observable mitigate 
line-of-sight projections. 



1 INTRODUCTION 



Galaxy clusters form the high-mass tail of hierarchical structure formation and are of
interest for constraining cosmological parameters, understanding large scale structure,
as extreme environments for galaxy formation and as objects hosting unique
astrophysical phenomena. While first discovered as concentrations of galaxies
(Abell 1958; Zwicky et al. 1966), they are now also routinely found as luminous,
extended X-ray sources (Schwartz 1978; McHardy 1978; Böhringer et al. 2000, 2001),
as peaks in the shear field (Wittman et al. 2006) and as "holes" in the microwave sky
(Staniszewski et al. 2009). To mitigate the systematic errors associated with each
individual method and to provide a more complete understanding of clusters,
multi-wavelength studies have become increasingly common. Each waveband adds
knowledge about clusters. However, we might expect there to be significant
correlations between effects in different methods both because the intrinsic properties
they measure depend on e.g. cluster size but also because they are similarly affected by
the complex environment surrounding clusters.

Roughly speaking, both the hot gas and galaxies in clusters trace the dark matter which
dominates the potential. We can approximate the clusters as self-similar and isothermal,
with a temperature $T \propto M^{2/3}$, a velocity dispersion $\sigma^2 \propto T$, and
richness $N \propto M$ for sufficiently massive halos (Kaiser 1986). Consider a small
region near the cluster: lensing measures the sum of all of the mass, richness all of
the mass in halos above some threshold and Compton (or SZ) distortion
(Sunyaev & Zel'dovich 1972; SZ) all of the mass in halos weighted by $M^{2/3}$. The
signal in each probe depends on the mass within the virialized region of the cluster,
the mass near the cluster but outside the virial region and (uncorrelated) mass at
larger distance along the line-of-sight. Lensing and Compton distortion measures
provide little line-of-sight resolution. The degree of projection involved in a richness
estimator depends on how well the galaxy distances are known (e.g. using photometric
or spectroscopic redshifts).

Figure 1. The velocity field traces the filamentary large-scale structure. The grey
scale shows the logarithm of the projected dark matter density in a $5\,h^{-1}$Mpc
thick slice around a cluster of mass $5 \times 10^{14}\,h^{-1}M_\odot$ at z = 0. The
slice is oriented to contain the halo particle velocity eigenvectors with the largest (x)
and smallest (y) eigenvalues. The dashed circle shows the region within $r_{200c}$.
Note that within the cluster the velocities trace the elongation of the matter, as
required. In addition the velocities at larger distances trace the filamentary structure
in the larger scale environment. Thus we expect that velocity anisotropy will be
correlated with density anisotropy.

Line-of-sight galaxy velocities in principle provide a measure of the potential well
depth or mass and offer the possibility of breaking line-of-sight projection. However,
the velocity field traces the density field, and can be correlated with line-of-sight
projection due to the filamentary nature of mass accretion onto massive halos (see
Fig. 1, which gives an example of this effect in our simulations). The velocities of
cluster galaxies can retain this large scale anisotropy (see also Tormen 1997;
Kasun & Evrard 2005, for studies of dark matter velocities). Thus it is easy to imagine
that line-of-sight velocity dispersion could be correlated with filamentary material
which can bias individual cluster measurements in e.g. richness, lensing or Compton
distortion.

We would like to investigate how the complex structure of the cosmic web of material
near clusters leads to correlations in individual cluster observables, and the
implications that this has for these four probes of clusters. This shared dependence,
not only on cluster properties but also on cluster environment, can introduce
additional subtleties when methods are combined. For example, an often used approach
is to "stack" clusters on the basis of one observed property X (e.g. richness), and then
look for correlations between two other properties, Y and Z. Clearly, it is very
important to understand the joint distribution P(X, Y, Z) and the degree of correlation
between scatter in X, Y or Z.

In this paper we use N-body numerical simulations with subhalos (which we identify
with galaxies) to study properties of cluster galaxy kinematics and the relation of the
scatters in velocity dispersion, Compton distortion, lensing, and optical richness^1
generated by nearby large-scale structure. Details of our numerical simulations and
methods for finding subhalos are given in §2. We describe how the mock richness,
lensing and Compton distortion observations are constructed in §3. Readers interested
in the results may skip to §4, where we discuss the intrinsic properties of our massive
halos and their subhalo (galaxy) populations, §5, where we discuss measurements of
galaxy kinematics in the presence of interlopers, and §6, where we discuss the
correlations between different observables.

The effects of the cosmic web, and in particular projection effects, have been
long-time concerns for optical cluster finding (Abell 1958; Dalton et al. 1992;
Lumsden et al. 1992; van Haarlem et al. 1997; White et al. 1999), measuring Compton
distortion (e.g. White, Hernquist & Springel 2002; Holder, McCarthy & Babul 2007;
Hallman et al. 2007; Shaw, Holder & Bode 2008) or interpreting weak lensing maps
(e.g. Reblinsky & Bartelmann 1999; Metzler, White & Loken 2001; Hoekstra 2001;
de Putter & White 2005; Meneghetti et al. 2010). Correlations between scatters
induced by common projection effects were noted in Cohn & White (2009). Cen (1997)
did an early simulation study of projection on several of the indicators we consider
here including richness, velocity dispersions and lensing, and measured substructure
using dark matter particles. For cluster kinematics in particular, the velocity
dispersion properties of dark matter particles and their relation to the cosmic web
were studied in Tormen (1997); Kasun & Evrard (2005) and Biviano et al. (2006) noted
that filamentary inflow was expected to affect measured velocity dispersions. Our
simulations have enough dynamic range that we can simulate a representative
cosmological volume, including the neighboring large-scale structure and cosmic web,
while simultaneously resolving and tracking the subhalos which we believe are galactic
hosts. This preserves any correlations between subhalo properties and halo orientation
or cosmic web, and coherence between subhalo populations which fell in as part of a
group. We emphasize the effect that anisotropy in galaxy kinematics has on
line-of-sight velocity dispersion or virial mass estimators of cluster mass and discuss
how this scatter compares to (and correlates with) other measures of cluster mass
which are sensitive to the cluster's environment. This forms a partial extension of the
work of Stanek et al. (2010), who discussed the correlations in scatters of a large
number of different intrinsic (rather than projected) cluster quantities, including
X-ray.



^1 We do not address X-ray emission in this paper.
2 SIMULATIONS

In order to investigate the above questions with 'realistic' conditions we need mock
galaxy, gas and lensing catalogs in which clusters of galaxies are placed in their
correct cosmological context, with an appropriate prescription for identifying galaxies
and for which the intrinsic cluster properties are known. We make use of several
dark-matter-only N-body simulations. Such simulations follow the evolution of large
dark matter halos, which we observe as galaxy clusters, correctly accounting for their
place in the filamentary large-scale structure and their complex formation histories.



2.1 N-body simulation 

We make use of several simulations in this paper. The main one is of the ΛCDM family
with $\Omega_m = 0.274$, $\Omega_\Lambda = 0.726$, h = 0.7, n = 0.95 and
$\sigma_8 = 0.8$, in agreement with a wide array of observations. Briefly, we used the
TreePM code described in White (2002) to evolve $2048^3$ equal mass particles in a
periodic cube of side length $250\,h^{-1}$Mpc. This results in particle masses of
$1.4 \times 10^8\,h^{-1}M_\odot$ and a Plummer equivalent smoothing of
$2.5\,h^{-1}$kpc. The initial conditions were generated by displacing particles from a
regular grid using second order Lagrangian perturbation theory at z = 150 where the
rms displacement is 38 per cent of the mean inter-particle spacing. The phase space
data were dumped at 45 times, equally spaced in ln(a) from z = 10 to 0. This TreePM
code has been compared to a number of other codes and shown to perform well for such
simulations (Heitmann et al. 2008). Though we shall not highlight them individually,
in addition to this simulation we have made use of the simulation described in
Wetzel & White (2010), which used a different subhalo finder, and four other
simulations of smaller volumes focused on massive halos where we have mass resolution
2-5 times higher than in the fiducial run and comparably higher force resolution. This
allows us to check the dependence on subhalo finding and tracking scheme, mass and
force resolution and on limiting mass.
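
As a quick consistency check on the numbers quoted above, the particle mass follows from the matter density, box size and particle count. The sketch below is illustrative only; the function name and the rounded value of the critical density are our own choices, not part of the simulation code.

```python
# Mean particle mass for an N-body run: total matter mass in the box
# divided by the number of particles.
RHO_CRIT = 2.775e11  # critical density today, in h^2 Msun / Mpc^3 (rounded)

def particle_mass(omega_m, box_size, n_per_side):
    """Particle mass in h^-1 Msun for a cubic box of side box_size (h^-1 Mpc)."""
    total_mass = omega_m * RHO_CRIT * box_size ** 3
    return total_mass / n_per_side ** 3

m_p = particle_mass(omega_m=0.274, box_size=250.0, n_per_side=2048)
mean_spacing = 250.0 / 2048            # mean inter-particle spacing, h^-1 Mpc
print(f"particle mass ~ {m_p:.2e} h^-1 Msun, spacing ~ {mean_spacing:.3f} h^-1 Mpc")
```

The result is close to the quoted $1.4 \times 10^8\,h^{-1}M_\odot$.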

For each output we found dark matter halos using the Friends of Friends (FoF)
algorithm (Davis et al. 1985) with a linking length of 0.168 times the mean
interparticle spacing. This partitions the particles into equivalence classes roughly
bounded by isodensity contours of $100\times$ the mean density. We keep all halos
above 50 particles, and generate merger trees for all of the halos in the simulation so
as to identify the times of last major mergers or other interesting events in the
history. The center of the halo is taken to be the position of the most bound particle,
including all of the mass in the Friends of Friends halo in the computation of the
potential.
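
To make the grouping step concrete, here is a minimal Friends of Friends sketch. It is a toy, brute-force version written for illustration (pair search is O(N^2), with a union-find to merge linked particles and a minimum-image convention for the periodic box); it is not the halo finder used for the simulations, and all names are our own.

```python
from collections import defaultdict

def friends_of_friends(points, linking_length, box_size):
    """Group points whose periodic separation is below linking_length."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def periodic_dist2(p, q):
        d2 = 0.0
        for a, b in zip(p, q):
            d = abs(a - b)
            d = min(d, box_size - d)   # minimum-image convention
            d2 += d * d
        return d2

    b2 = linking_length ** 2
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if periodic_dist2(points[i], points[j]) < b2:
                parent[find(i)] = find(j)   # merge the two groups

    groups = defaultdict(list)
    for i in range(len(points)):
        groups[find(i)].append(i)
    return sorted(groups.values(), key=len, reverse=True)

pts = [(0.1, 0.0, 0.0), (0.3, 0.0, 0.0), (9.9, 0.0, 0.0),  # chain across the box edge
       (5.0, 5.0, 5.0)]                                     # isolated particle
halos = friends_of_friends(pts, linking_length=0.25, box_size=10.0)
```

In a real application the linking length would be set to 0.168 times the mean interparticle spacing and groups below the 50 particle threshold would be discarded.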

Given halo centers, we also compute the spherically averaged mass profile taking into
account all of the mass in the simulation. We follow standard convention and define
the virial radius as that radius within which the mean density is 200 times the critical
density at the epoch of observation, writing this $r_{200c}$. The three dimensional
velocity dispersion of the dark matter within $r_{200c}$ is tightly correlated with the
mass interior to the same radius as expected from the virial relation (e.g.
Evrard et al. 2008, for a recent study). The mass function of halos is approximately
universal if a density contrast tied to the mean density and encompassing the
zero-velocity surface is used (Jenkins et al. 2001; White 2001; Robertson et al. 2009;
Bhattacharya et al. 2010). When appropriate we also use the radius within which the
mean density is 180 times the background density, $r_{180b}$, for convenience. Unless
stated otherwise below, the mass quoted will be $M_{180b}$.
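
The two radius definitions differ only in the reference density and contrast. As a rough illustration (assuming z = 0, a rounded critical density, and treating the same mass as enclosed in both cases purely for comparison), the radius enclosing a mean density $\Delta\,\rho_{\rm ref}$ can be sketched as:

```python
import math

RHO_CRIT0 = 2.775e11  # critical density at z = 0, in h^2 Msun / Mpc^3 (rounded)

def overdensity_radius(mass, delta, rho_ref):
    """Radius (h^-1 Mpc) within which the mean density is delta * rho_ref.

    mass is the enclosed mass in h^-1 Msun; rho_ref is in h^2 Msun / Mpc^3.
    """
    return (3.0 * mass / (4.0 * math.pi * delta * rho_ref)) ** (1.0 / 3.0)

omega_m = 0.274
mass = 1.0e14
r_200c = overdensity_radius(mass, 200.0, RHO_CRIT0)            # contrast vs. critical
r_180b = overdensity_radius(mass, 180.0, omega_m * RHO_CRIT0)  # contrast vs. background
```

Because the background density is lower than the critical density, $r_{180b}$ lies well outside $r_{200c}$ for the same enclosed mass.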



2.2 Subhalos/Galaxies 

In hierarchical structure formation models, such as CDM, the virialized regions of
large dark matter halos contain subhalos (self-gravitating, bound clumps of dark
matter) which contain O(10) per cent of the total halo mass. Luminous galaxies form
via the cooling and condensation of baryons in the very centers of halos and subhalos
so these subhalos identify the sites of galaxy formation in the simulation.

We identify "subhalos" within our Friends of Friends halos as overdensities in phase
space (see Appendix A). For newly formed halos the "central" subhalo is defined as the
most massive subhalo within the host. For other halos it is defined as the descendant
of the central subhalo within the most massive progenitor of the host halo. The subhalo
position is that of its most bound particle. Subhalo merger trees are computed as
described in Wetzel, Cohn & White (2009) and Wetzel & White (2010). Briefly, subhalo
histories are tracked across 4 consecutive output times to ensure subhalos are not
"lost" during close passes through the dense central regions of a halo. Parent-child
relationships are determined using the 20 most bound particles (which we have found
to be very stable). For each subhalo we define $M_{\rm inf}$ as the host halo mass it
had just prior to becoming a satellite, i.e. the largest host halo mass for which it was
the central subhalo. We shall use $M_{\rm inf}$ as a proxy for stellar mass or
luminosity and keep all subhalos whose infall mass is larger than
$2 \times 10^{11}\,h^{-1}M_\odot$ ($> 10^3$ particles). Resolution tests indicate the
catalogs are largely complete to this mass limit (see also Boylan-Kolchin et al. 2009).

As discussed extensively in Wetzel & White (2010) there are slightly more satellites
per host halo, and correspondingly more small-scale clustering power, than
observations demand. If we remove the excess based on the ratio of instantaneous
subhalo mass to infall mass (Wetzel & White 2010) and match subhalos to galaxies
based on abundance then our halo catalog is in good agreement with many observations
including the global and cluster luminosity functions, the satellite statistics and the
luminosity dependent clustering of galaxies.

We shall focus primarily on z ~ 0.1, discussing what changes as we go to higher
redshift in §7. At z = 0.1, the number density of subhalos above our mass threshold is
$0.02\,h^3$Mpc$^{-3}$. Observationally the same number densities are achieved by
going down to $0.2\,L_*$ or $M_r = -18.5$ in the r-band (Blanton et al. 2003) or about
$M_B = -18.6$ (Faber et al. 2007) or a stellar mass of about
$3 \times 10^9\,h^{-1}M_\odot$ (Moster et al. 2010).

As our simulation explicitly tracks the evolution of subhalos, including their complex
dynamics and mass loss, we are in a position to ask sophisticated questions about the
spatial and kinematic distribution of "galaxies" in clusters. As shown below, the
subhalo spatial distribution and its environment dependence is in good agreement with
corresponding observations of galaxies. By using subhalos, rather than just randomly
drawing particles from within the halo, we ensure that we keep any correlations
between subhalo positions and the halo orientation or its large-scale environment (see
e.g. Faltenbacher et al. 2009; Siverd, Ryden & Gaudi 2010, for recent reviews) and
between positions and dynamics of subhalos that fell into the host as part of a larger
group.

Figure 2. The halo occupation distribution for our clusters at z ~ 0.1 (points),
compared to the SDSS group catalogs of Yang, Mo & van den Bosch (2008, solid lines)
for two different luminosity thresholds. Subhalo masses are converted to r-band
luminosity by abundance matching, assuming a 1-1 relation (i.e. no scatter). The
panels correspond to lg M > 11.7 and 12.2 respectively, with masses measured in
$h^{-1}M_\odot$. For each cluster the halo mass is defined as that interior to
$r_{180b}$, within which the mean matter density is 180 times the background density.
The richness is defined as all galaxies above a phase-space density threshold, as in the
observations. We plot 30 lines-of-sight per halo.

Figure 3. The (projected) subhalo profile for galaxies brighter than $0.4\,L_*$ (i.e.
$M_* + 1$), normalized at the virial radius. The curves show a simple singular
isothermal sphere ($\Sigma \propto R^{-1}$) and the profile which
Lin, Mohr & Stanford (2004) found to fit the counts of K-band selected galaxies at
z ~ 0.1.

In Figure 2 we demonstrate that the halo occupation distribution of galaxies in the
simulation is in good agreement with the measurements from the group catalog in
Yang, Mo & van den Bosch (2008). Our satellite spatial distribution is slightly
shallower than the dark matter in the central regions. The (projected) profile matches
well the NFW profile found to fit the counts of K-band selected galaxies by
Lin, Mohr & Stanford (2004) down to $r \sim 0.1\,r_{200c}$, see Figure 3.

Figure 4. The isotropically averaged 1D velocity dispersion of the dark matter vs. the
satellite subhalos for halos with more than 50 members at z ~ 0.1. The points are for
all particles and satellites within spheres of radius $r_{200c}$ (crosses) and
$r_{100c}$ (squares) about the most bound particle in the halo. The short-dashed line
represents equality and the dotted ($r_{200c}$) and long-dashed ($r_{100c}$) lines are
an unweighted fit to the points.

We find evidence for mild positive velocity bias within the virial sphere (Fig. 4), in
agreement with most previous work (Gao et al. 2004; Goto 2005;
Faltenbacher & Diemand 2006; Lau, Nagai & Kravtsov 2010; Faltenbacher 2010). The
velocities of satellites appear to be determined almost entirely by the hosts' potential
(Wetzel 2010), although we caution that the degree of velocity bias does depend on the
manner in which subhalos are selected and retained, with more massive satellites
showing reduced dispersion and satellites accreted more recently generally having
increased dispersion^2. This may become important as we move to higher redshift
where the mean mass of the host halos should decrease while the mean mass of the
halos for which one could obtain accurate redshifts should increase, leading to a
smaller mass ratio and larger effects of dynamical friction. This is partially canceled
by the "more recent" infall time distribution at higher redshift. In summary, the exact
amount of velocity bias will depend on how the subhalo samples are selected
(Gao et al. 2004; Goto 2005; Faltenbacher & Diemand 2006;
Lau, Nagai & Kravtsov 2010; Faltenbacher 2010), and can evolve with redshift. Since
it is not entirely clear how to match an observed galaxy sample to a particular subhalo
sample it seems prudent to assign an O(10) per cent theoretical uncertainty in the
absolute value of the velocity bias with a roughly comparable scatter from halo to
halo.



^2 Biviano et al. (2006) also showed that the bias differed for 'early-type' and
'late-type' subhalos.
2.3 Missing physics

Our simulations do not attempt to model the baryonic component, and thus can only be
an approximation to the full story. Fortunately for the most massive objects in the
universe, the majority of the baryonic material is in hot gas, rather than cold gas or
stars. The cooling of gas in massive clusters does not dramatically alter the halo
profile, except in the very inner regions (e.g. Kazantzidis et al. 2004). Outside of
these regions the spatial distribution of the hot gas largely follows the gravitational
potential and we shall make this assumption where necessary. The hot intra-cluster
medium in massive halos is expected to alter the orbits of satellites only mildly
(Simha et al. 2009; Lau, Nagai & Kravtsov 2010; Jiang, Jing & Lin 2010), since it is
a minority mass component and they are traveling at close to the sound speed (e.g.
Conroy & Ostriker 2008). The cooling of gas in the centers of our subhalos could help
to stabilize them against disruption. Our numerical resolution is high enough that the
relevant subhalos are not lost to numerical disruption in any case, and our satellite
fractions are at or above observational estimates (see Wetzel & White 2010, for a
compilation). The outer envelope which is lost to stripping is expected to be mostly
dark matter, so this physics will be correctly modeled. Once a majority of the mass is
lost the subhalo mass will be much less than the host halo mass, and the amount of
dynamical friction experienced will be small, mitigating any error in the precise
amount (Simha et al. 2009; Lau, Nagai & Kravtsov 2010; Jiang, Jing & Lin 2010).
Extending high dynamic range simulations such as ours with additional physics which
is in accord with observational constraints would be very interesting.



3 MOCK OBSERVATIONS 

Given the matter and subhalo distribution, we compute a number of mock observations
to investigate how the complicated nature of structure formation influences
observational probes of clusters. In all cases we use constant time outputs from the
simulation, and consider the box in isolation, i.e. we do not attempt to make light
cones, remap or stack boxes. Our simulations contain sufficient path length to answer
the questions of interest to us here without needing to employ these techniques. Also,
we do not model the cluster finding process itself. Rather we ask about the
measurements that could be made once a cluster was correctly identified.

To identify correlations due to the anisotropic nature of the cluster and its
environment, we observe each cluster along 96 different lines of sight, centering it
within the periodic box. (For intrinsic measurements in §4 more sightlines are
considered, when needed, as described therein.) Each line of sight then is used to find
galaxy richness, velocity dispersion, lensing and integrated Compton distortion. Our
resulting sample has 83 clusters with $M_{180b} > 2 \times 10^{14}\,h^{-1}M_\odot$
along almost 8,000 lines of sight total, and 242 clusters with
$M_{180b} > 10^{14}\,h^{-1}M_\odot$ along ~ 23,000 lines of sight.
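
The text does not say how the 96 viewing directions are distributed, so the following is only one possible scheme, shown for illustration: a golden-spiral (Fibonacci) construction giving roughly uniform directions on the sphere, together with the projection of a 3D velocity onto a sightline. The function names are our own.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly uniformly spaced unit vectors (golden-spiral construction)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    directions = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n     # evenly spaced in cos(theta)
        r = math.sqrt(1.0 - z * z)
        phi = k * golden_angle
        directions.append((r * math.cos(phi), r * math.sin(phi), z))
    return directions

def line_of_sight_velocity(velocity, direction):
    """Project a 3D velocity onto a unit line-of-sight vector."""
    return sum(v * n for v, n in zip(velocity, direction))

sightlines = fibonacci_sphere(96)
n_total_high = 83 * len(sightlines)    # clusters above 2e14 h^-1 Msun
n_total_all = 242 * len(sightlines)    # clusters above 1e14 h^-1 Msun
```

With 96 directions per cluster the two samples give 7,968 and 23,232 cluster sightlines, matching the totals quoted above.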



3.1 Richness 

The easiest property of a cluster to observe is its "richness", or the number of
galaxies it contains. Each halo above any infall mass threshold, $M_{\rm min}$, hosts
one central subhalo above the same threshold mass, and a number of satellite subhalos
which is (approximately) Poisson distributed about a mean $(M_{\rm halo}/M_1)$ with
$M_1 \simeq 15\,M_{\rm min}$. Unfortunately this information is not observationally
accessible, and proxies must be used. There are numerous definitions of richness in
the literature; here we consider only two as representative of the class.
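
The intrinsic occupation described above (one central plus Poisson satellites with mean $M_{\rm halo}/M_1$) can be sketched as a simple random draw. This is an illustration of the statistical model only, not of how subhalos are identified in the simulation; the Poisson sampler (Knuth's multiplication method) and all names are our own choices.

```python
import math
import random

def poisson_sample(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method (modest means only)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def intrinsic_richness(m_halo, m_min, rng, m1_over_mmin=15.0):
    """One central plus Poisson satellites with mean M_halo / M_1, M_1 = 15 M_min."""
    if m_halo < m_min:
        return 0                       # halo below threshold hosts no galaxy
    mean_sat = m_halo / (m1_over_mmin * m_min)
    return 1 + poisson_sample(mean_sat, rng)

rng = random.Random(42)
draws = [intrinsic_richness(2.0e14, 2.0e11, rng) for _ in range(2000)]
mean_richness = sum(draws) / len(draws)
```

For a $2 \times 10^{14}\,h^{-1}M_\odot$ halo and $M_{\rm min} = 2 \times 10^{11}\,h^{-1}M_\odot$ the expected richness under this model is about 68.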

The first is the richness defined by Yang, Mo & van den Bosch (2008), which computes
a phase space density for each cluster and assigns galaxies above a threshold to a
cluster candidate. The richness is the number of galaxies assigned. Rather than iterate
our fit, we use the cluster's true mass in the model, but otherwise implement the
method as they describe, including all galaxies within the simulation, not just true
cluster members, in the calculation^3. As shown in Fig. 2 the richness measured in our
simulations is in quite good agreement with that inferred from the observations and
the richness does show strong trends with host halo mass. However it does require
knowledge of the spectroscopic redshifts of all galaxies. We call this quantity phase
space richness below.

A second richness definition counts only those galaxies within the red sequence and
within an aperture, subtracting an estimate of the contamination. The hope is that
using only these galaxies reduces the impact of interloper galaxies from large
line-of-sight distance and blue galaxies in front of the cluster, without requiring
spectroscopic redshift information (see, e.g. Gladders & Yee 2000, 2005). This
requires us to assign a color to each of our mock galaxies. By abundance matching we
are able to assign a luminosity (or stellar mass) to all of our subhalos. We use the
method of Skibba & Sheth (2009) to further assign them a color and we include them in
the projected red sequence based upon their distance from the true cluster redshift
(this method has only been calibrated at z ~ 0.1 so we do not assign colors when
considering higher z). Further details are given in Appendix B. The richness includes
galaxies brighter than $0.4\,L_*$ (i.e. $M_* + 1$) with the background subtraction
computed precisely using the periodic simulation volume. The transverse aperture is
set following the convention used in the maxBCG catalog (Koester et al. 2007; see also
High et al. 2010): a first estimate of the richness is obtained within a
$1\,h^{-1}$Mpc transverse radius and this richness is used to estimate $R_{200b}$
which is then used for a final richness estimate. Our final richness-mass relation (not
shown) is in good agreement with the scaling relation found in observed clusters. Note
that this procedure has the unwelcome property of increasing scatter due to
filament-based projection effects. Should the initial estimate be high due to projected
galaxies the aperture will be set too large and include even more galaxies. We noted a
large increase in richness scatter at fixed mass using this procedure relative to when
we use the true radii.
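
The feedback between the first richness estimate and the final aperture can be illustrated with a toy calculation. The richness-radius scaling below is HYPOTHETICAL (placeholder coefficients chosen for illustration, not the maxBCG calibration), as are the mock galaxy radii; the sketch only shows why projected galaxies inflate the aperture and are then over-counted.

```python
def count_within(radii, aperture):
    """Number of galaxies with projected radius (h^-1 Mpc) inside the aperture."""
    return sum(1 for r in radii if r <= aperture)

def radius_from_richness(n_gal, norm=0.2, slope=0.5):
    """HYPOTHETICAL richness-radius scaling, R = norm * N**slope (h^-1 Mpc).

    The coefficients are placeholders, not a published calibration.
    """
    return norm * n_gal ** slope

def two_step_richness(radii, first_aperture=1.0):
    """Count in a fixed first aperture, then recount inside the implied radius."""
    n_first = count_within(radii, first_aperture)
    aperture = radius_from_richness(n_first)
    return count_within(radii, aperture), aperture

members = [0.05 * k for k in range(1, 31)]             # mock member galaxies
interlopers = [0.1 * k + 0.02 for k in range(1, 15)]   # mock projected filament galaxies

n_clean, r_clean = two_step_richness(members)
n_proj, r_proj = two_step_richness(members + interlopers)
# Interlopers that would have been counted had the aperture not grown:
n_fixed = n_clean + count_within(interlopers, r_clean)
```

In this toy case the projected galaxies enlarge the aperture, so the final count exceeds what the same interlopers would contribute inside the uncontaminated radius.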



^3 We use H/c rather than $H_0/c$ as the prefactor in their equation 7.
3.2 Galaxy kinematics

Modeling of galaxy kinematics in clusters remains a major tool in determining their
properties. Since we are able to resolve and track the subhalos which would host
galaxies within our simulation, we are in a good position to study how their velocity
structure depends upon and correlates with cluster properties and the larger
environment. As this capability is new in terms of mock velocity observations, we
shall develop it in some detail in the next two sections. The intrinsic properties of the
velocity field, including velocity bias, were discussed in §2.2. Anisotropy and
substructure are discussed in §4. Our modeling of interloper rejection and dispersion
estimation is the subject of §5.



3.3 Lensing 

The distribution of mass can be probed by the distortion of background galaxy shapes
due to the gravitational deflection of light by the potentials of massive halos. This
signal is sensitive to the projected mass along the line-of-sight, $\Sigma$, weighted by
a kernel (e.g. Hoekstra, Yee & Gladders 2002; Refregier 2003, for recent reviews). The
lensing kernel varies only very slowly with distance, so all of the matter in and
around the cluster receives similar weight. If we assume the source and lens redshift
(distributions) are known, lensing measures the projected mass. We do not attempt to
model the full light cone here, rather we make the approximation that mass far from
the cluster is uncorrelated with the cluster and contributes only a "noise" while mass
close to the cluster receives the same weight as the cluster itself. We ignore the noise
term (and any additional noise from the finite number of source galaxies or
observational non-idealities) and approximate a lensing observation as a measurement
of the projected mass, apodized with a Welch kernel

$W(Z) = 1 - \left(\frac{2Z}{L_{\rm box}}\right)^2$    (1)

where $-\frac{1}{2}L_{\rm box} < Z < \frac{1}{2}L_{\rm box}$ is the line-of-sight
coordinate. The window vanishes for $|Z| \geq \frac{1}{2}L_{\rm box}$. We model all
lensing observations as applying to $\Sigma(R)$ determined in this manner along any
line-of-sight, using the periodicity of the box to place the lensed object at the center
of the box.
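
A minimal sketch of the apodized projection, assuming the parabolic Welch window written above and a line of sight along the z axis of the box. The particle layout and function names are our own illustrative choices; a real measurement would bin the result in projected radius to obtain $\Sigma(R)$.

```python
def welch_window(z, box_size):
    """Welch (parabolic) window: 1 at the box center, 0 at |z| >= box_size / 2."""
    x = 2.0 * z / box_size
    return max(0.0, 1.0 - x * x)

def periodic_offset(coord, center, box_size):
    """Signed offset of coord from center using the periodic minimum image."""
    return (coord - center + 0.5 * box_size) % box_size - 0.5 * box_size

def apodized_projected_mass(particles, center, box_size, r_max):
    """Sum m * W(Z) for particles within projected radius r_max of the center.

    particles: iterable of (x, y, z, mass); the line of sight is the z axis.
    """
    total = 0.0
    for x, y, z, mass in particles:
        dx = periodic_offset(x, center[0], box_size)
        dy = periodic_offset(y, center[1], box_size)
        dz = periodic_offset(z, center[2], box_size)
        if dx * dx + dy * dy <= r_max * r_max:
            total += mass * welch_window(dz, box_size)
    return total

toy = [(249.0, 0.0, 0.0, 1.0),    # wraps to 1 h^-1 Mpc from the center, full weight
       (0.0, 0.0, 62.5, 2.0),     # quarter box along the sightline, weight 0.75
       (10.0, 0.0, 0.0, 1.0)]     # outside the projected aperture
sigma_mass = apodized_projected_mass(toy, center=(0.0, 0.0, 0.0), box_size=250.0, r_max=2.0)
```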

A detailed study of lensing projection effects is not the focus of this paper. It has
been discussed in detail previously (Reblinsky & Bartelmann 1999;
Metzler, White & Loken 2001; Hoekstra 2001; de Putter & White 2005;
Meneghetti et al. 2010). In order to gauge the approximate size of the effect and its
degree of correlation with other measures of cluster size we simply fit a
singular-isothermal-sphere model ($\rho \propto r^{-2}$) to the lensing signal. In
order to remove much of the uncorrelated signal we use the $\zeta$ statistic

$\zeta(R_0; R_1, R_2) \propto \langle\Sigma(R < R_0)\rangle - \langle\Sigma(R_1 < R < R_2)\rangle$    (2)

where the constant of proportionality depends on the source and lens redshift
distributions which we shall assume known for simplicity. For our
singular-isothermal-sphere this gives

$\zeta \propto \frac{\sigma_{\rm lens}^2}{G}\left(\frac{1}{R_0} - \frac{1}{R_1 + R_2}\right)$    (3)

as a function of $R_0$ at fixed $R_1$ and $R_2$ which we use to fit for
$\sigma_{\rm lens}$. The results are quite stable to variations in the $R_i$; for our
fiducial results we fit $R_0$ in the range $0.1\,r_{180b}$ to $r_{180b}$ with
$R_1 = r_{180b}$ and $R_2 = 1.25\,r_{180b}$. Qualitatively similar results are
obtained if we fit directly to the projected mass, $\Sigma(R)$, or use a different
profile such as the broken power-law of Navarro, Frenk & White (1997).
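
Since Eq. (3) is linear in $\sigma_{\rm lens}^2$, the fit reduces to a one-parameter least-squares amplitude. The sketch below assumes a constant of proportionality of unity (i.e. $\zeta$ is the difference of mean surface densities), uses physical rather than $h$-scaled units, and checks the fit on noiseless mock values generated from the same model; these are our simplifying assumptions, not a description of the actual fitting code.

```python
G_NEWTON = 4.301e-9  # gravitational constant in Mpc (km/s)^2 / Msun

def zeta_sis(r0, r1, r2, sigma_lens):
    """SIS prediction for <Sigma(<R0)> - <Sigma(R1<R<R2)> in Msun / Mpc^2."""
    return sigma_lens ** 2 / G_NEWTON * (1.0 / r0 - 1.0 / (r1 + r2))

def fit_sigma_lens(r0_values, zeta_values, r1, r2):
    """Least-squares amplitude fit of zeta = A * x with x = 1/R0 - 1/(R1+R2)."""
    xs = [1.0 / r0 - 1.0 / (r1 + r2) for r0 in r0_values]
    amplitude = sum(x * z for x, z in zip(xs, zeta_values)) / sum(x * x for x in xs)
    return (amplitude * G_NEWTON) ** 0.5

r180b = 1.2                                    # illustrative radius in Mpc
r1, r2 = r180b, 1.25 * r180b                   # fiducial annulus from the text
r0_values = [r180b * (0.1 + 0.1 * k) for k in range(10)]   # 0.1 to 1.0 r_180b
mock = [zeta_sis(r0, r1, r2, sigma_lens=800.0) for r0 in r0_values]
sigma_fit = fit_sigma_lens(r0_values, mock, r1, r2)
```

On noiseless input the fit returns the input dispersion; with projected structure along the sightline the recovered $\sigma_{\rm lens}$ would scatter about the intrinsic value.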



3.4 Sunyaev-Zel'dovich effect 

Another method for finding and weighing galaxy clusters is to study the distortion
they introduce in the cosmic microwave background (CMB). The "hot" electrons in the
intra-cluster medium can scatter the "cold" CMB photons to higher energy, distorting
the spectrum in predictable ways (Sunyaev & Zel'dovich 1972). The surface brightness
of the Compton distortion is independent of distance, and the integrated signal is
proportional to the total thermal energy of the gas, making this a powerful means for
finding and characterizing clusters. The insensitivity to distance, however, means that
SZ experiments must also contend with projection effects. Assuming a self-similar
cluster, the Compton distortion scales as $M^{5/3}$, so lower mass halos contribute
fractionally less than they do to a lensing or galaxy measure, but the relatively lower
resolution of the observations exacerbates the problem.

In earlier work (Cohn & White 2009) we investigated
optical and SZ methods for finding clusters and found that
the scatter from the cluster candidates in these two methods
was correlated. We continue that investigation here, using a
simple model of the Compton distortion appropriate to low-resolution
observations (such as provided by the South Pole
Telescope or Atacama Cosmology Telescope). We assign
to each dark matter particle in the simulation a "mean"
temperature based on the velocity dispersion of its parent
halo, and compute the total Compton distortion as a sum
of the mass times the temperature in cylinders, apodizing
the signal as above (Eq. 1). This misses contributions to the
temperature from e.g. shocks, small-scale structure in the
intra-cluster gas and the run of temperature with radius. For
partially resolved cluster observations, however, the low-order
properties of the maps so obtained are in reasonable agreement
with hydrodynamic simulations which include these
effects (e.g. White, Hernquist & Springel 2002), and serve
to illustrate our main points. We shall use as our observable
the integrated Compton y-parameter within a disk of radius
$r_{180b}$, as this is a more stable quantity than e.g. the central
decrement.
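A rough sketch of the "mass times temperature in a cylinder" sum is given below. It makes several assumptions that are not specified in the text: the temperature is taken proportional to the square of the parent-halo velocity dispersion, the cylinder is a sharp-edged disk in projection with no apodization, and the result is a proxy for the integrated y-parameter up to a constant factor. The function name `compton_y_proxy` is hypothetical.

```python
import numpy as np

def compton_y_proxy(xy, mass, sigma_parent, center, radius):
    """Sum of mass * temperature for particles projected inside a disk.

    xy           : (N, 2) projected particle positions
    mass         : (N,) particle masses
    sigma_parent : (N,) velocity dispersion of each particle's parent halo
    Temperature is assumed proportional to sigma_parent**2, so the
    return value is the integrated distortion up to a constant.
    """
    offset = np.asarray(xy, dtype=float) - np.asarray(center, dtype=float)
    inside = np.hypot(offset[:, 0], offset[:, 1]) <= radius
    temperature = np.asarray(sigma_parent, dtype=float) ** 2
    return float(np.sum(np.asarray(mass, dtype=float)[inside] * temperature[inside]))

# Three mock particles; only the first two fall inside the unit disk
y_proxy = compton_y_proxy(
    xy=[[0.0, 0.0], [0.5, 0.0], [2.0, 0.0]],
    mass=[1.0, 1.0, 1.0],
    sigma_parent=[2.0, 1.0, 3.0],
    center=[0.0, 0.0],
    radius=1.0,
)
```

Since every particle along the line of sight inside the disk contributes, foreground and background halos enter the sum in the same way as the cluster itself, which is the projection effect discussed above.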



^ South Pole Telescope: http://pole.uchicago.edu

^ Atacama Cosmology Telescope: http://www.physics.princeton.edu/act

4 HALO INTRINSIC PROPERTIES

4.1 Halos

We begin by considering the intrinsic properties of our
massive halos and their subhalo populations, absent any
line-of-sight projection or misidentifications. It is well
known that the 3D density profile of massive halos is
triaxial (Thomas & Couchman 1992; Warren et al. 1992;
Jing & Suto 2002), with the major axis approximately
twice as long as the minor axes which are approximately
equal in size. When spherically averaged the density profiles
of the 'relaxed' halos resemble a broken power-law
(Navarro, Frenk & White 1997) with the inner regions forming
early and then remaining approximately constant as subsequently
accreted dark matter is kept away from the center
by the angular momentum barrier. Our subhalos follow a
profile similar to that of the dark matter, though shallower
in the central regions.

4.2 Velocity ellipsoid

Although the 3D, dark matter velocity dispersion within
$r_{200c}$ is well correlated with $M_{200c}$ (e.g. Evrard et al. 2008)
and the galaxies show little velocity bias compared to the
dark matter, the line-of-sight velocity dispersions show considerably
more scatter. The galaxy line-of-sight dispersion
for any viewing angle, $\hat{n}$, is simply
$\sigma^2_{\rm los} = \hat{n}^T \cdot \sigma^2 \cdot \hat{n}$, where
$\sigma^2_{ij}$ is the anisotropy tensor,
$\sigma^2_{ij} = \langle (v-\bar{v})_i (v-\bar{v})_j \rangle$, averaged
over subhalos in the host halo. As has been noted before in
the dark matter particles (Tormen 1997; Kasun & Evrard
2005), and seen here for the galaxy subhalos as well, the
velocity tensor, like the moment of inertia tensor, is quite
anisotropic (see also Fig. 1). Not surprisingly the principal
axes of the two are quite well aligned, with a typical misalignment
angle of $\sim 20-30^\circ$.

Figure 5. The distribution of $\lambda_2/\lambda_1$ (dotted) and $\lambda_3/\lambda_1$ (dashed)
for all halos more massive than $2\times10^{14}\,h^{-1}M_\odot$ at $z = 0.1$. Thick
lines show the results for subhalos within the FoF halos and thin
lines for subhalos within $r_{180b}$ of the most bound particle in each
group. Note that the eigenvalues, $\lambda$, of the velocity anisotropy
tensor scale as $\lambda \sim \sigma^2$.

Figure 6. The distribution of $\delta\sigma^2_{\rm los}/\langle\sigma^2_{\rm los}\rangle$ for all halos more
massive than $2\times10^{14}\,h^{-1}M_\odot$ at $z = 0$. As in Fig. 5 thick lines show
the results for subhalos within the FoF halos and thin lines for
subhalos within $r_{200}$ of the most bound particle in each group.

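If we order the eigenvalues of $\sigma^2_{ij}$ as $\lambda_1 > \lambda_2 > \lambda_3$ then
for uniformly chosen $\hat{n}$ the distribution of $\sigma^2_{\rm los}$ has a peak at
$\lambda_2$, a mean at $\frac{1}{3}(\lambda_1 + \lambda_2 + \lambda_3)$ and a width

$(\delta\sigma^2_{\rm los})^2 = \frac{4}{45}\left[\lambda_1^2 + \lambda_2^2 + \lambda_3^2 - \lambda_1\lambda_2 - \lambda_2\lambda_3 - \lambda_3\lambda_1\right]$ (4)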

For our sample, the distribution of eigenvalues for all
halos above $2\times10^{14}\,h^{-1}M_\odot$ at $z = 0.1$ is shown in Fig. 5,
where we see typical values for $\lambda_3/\lambda_1$ and $\lambda_2/\lambda_1$ are 0.3 and
0.6 respectively. For the more massive subhalos the spread
in eigenvalues is slightly larger than for a random subset of
the mass but they become increasingly comparable as we
move down the subhalo mass function.
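The relation between the velocity tensor, its eigenvalues and the spread of line-of-sight dispersions can be checked numerically. The sketch below is illustrative only: the mock velocities are Gaussian draws with axis ratios chosen to give eigenvalue ratios near the typical 0.6 and 0.3 quoted above, and the function names are hypothetical. It compares a Monte Carlo over isotropic viewing angles with the analytic mean $\frac{1}{3}(\lambda_1+\lambda_2+\lambda_3)$ and the width $\frac{4}{45}[\ldots]$ stated in the text.

```python
import numpy as np

def velocity_tensor(vel):
    """sigma^2_ij = <(v - vbar)_i (v - vbar)_j> for an (N, 3) velocity array."""
    dv = np.asarray(vel, dtype=float)
    dv = dv - dv.mean(axis=0)
    return dv.T @ dv / len(dv)

def los_dispersion_sq(tensor, n_hat):
    """sigma_los^2 = n^T sigma^2 n for unit vectors n_hat of shape (M, 3)."""
    return np.einsum("mi,ij,mj->m", n_hat, tensor, n_hat)

def predicted_mean_and_width(tensor):
    """Mean and rms width of sigma_los^2 over isotropic viewing angles."""
    lam = np.linalg.eigvalsh(tensor)
    mean = lam.sum() / 3.0
    cross = lam[0] * lam[1] + lam[1] * lam[2] + lam[2] * lam[0]
    var = 4.0 / 45.0 * (np.sum(lam**2) - cross)
    return mean, np.sqrt(var)

rng = np.random.default_rng(0)
# Mock subhalo velocities with an anisotropic ellipsoid (illustrative numbers)
vel = rng.normal(size=(2000, 3)) * np.array([1.0, 0.8, 0.55])
tensor = velocity_tensor(vel)

# Isotropic lines of sight: normalised Gaussian vectors are uniform on the sphere
n_hat = rng.normal(size=(10000, 3))
n_hat /= np.linalg.norm(n_hat, axis=1, keepdims=True)

sig2 = los_dispersion_sq(tensor, n_hat)
mean_pred, width_pred = predicted_mean_and_width(tensor)
```

Here `sig2.mean()` and `sig2.std()` should approach `mean_pred` and `width_pred` as the number of sightlines grows, and a histogram of `sig2` is visibly non-Gaussian.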

We found that the distribution of measured velocity dispersion
for any cluster, along 10,000 randomly selected lines
of sight, tended to be significantly non-Gaussian. The distribution
of $\delta\sigma^2_{\rm los}$ from Eq. 4 is shown for our massive halos
in Fig. 6; we see that $\delta\sigma^2_{\rm los}/\sigma^2_{\rm los}$ is peaked at 20-30 per cent.
If one assumes $M \propto \sigma^3_{\rm los}$ this gives an inferred mass error of
nearly 40 per cent. This suggests that, even absent any interlopers,
velocity bias, or observational non-idealities, velocity
dispersion mass estimators will work better in an ensemble
sense than for any individual cluster.
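The step from the quoted scatter to the mass error can be written out. This is a linearised estimate, taking 25 per cent as a representative value within the 20-30 per cent range (an assumption made only for the arithmetic):

```latex
M \propto \sigma_{\rm los}^{3} = \left(\sigma_{\rm los}^{2}\right)^{3/2}
\quad\Rightarrow\quad
\frac{\delta M}{M} \simeq \frac{3}{2}\,
\frac{\delta\sigma_{\rm los}^{2}}{\sigma_{\rm los}^{2}}
\simeq \frac{3}{2}\times 0.25 \approx 0.38
```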

As an example of an ensemble measurement, Fig. 7
shows the distribution of velocity dispersions measured from
our simulation for 10 clusters with $3\times10^{14}\,h^{-1}M_\odot <
M_{180b} < 3.5\times10^{14}\,h^{-1}M_\odot$. The solid histogram is the composite
of $\sigma$ values for all the clusters, using the member
galaxies only and projecting along 10,000 lines of sight for
each cluster, while the line shows a Gaussian fit. (The dotted
line and histogram are for the distribution which results
when the same clusters are observed along 96 lines of sight,
including interlopers and a culling method discussed below
in §5.)

This intrinsic line of sight scatter also suggests that if
the goal is to determine the mass distribution there is an
upper limit to the number of galaxy redshifts per cluster
it is desirable to obtain: there is little to be gained by reducing
sources of error in $\sigma_{\rm los}$ significantly below the dispersion
above. Figure 8, which shows the velocity dispersion
as a function of the number of subhalos used (added
in order of decreasing luminosity), gives an illustration of
this. Only subhalos which are within the friends-of-friends
halo are included. All of the measures converge to a stable
value for large numbers of subhalos, but the value depends
significantly on the chosen line-of-sight. We find the number
of subhalos at which the asymptotic limit is reached,
and whether that approach is from above or below, depends
upon the cluster under consideration, but the results are
generally stable once 50 subhalos are included (Fig. 9).

Fig. 10 shows some typical line-of-sight velocity histograms
and phase-space distributions for a massive
($M_{180b} = 2.4\times10^{14}\,h^{-1}M_\odot$) cluster viewed down three different
lines-of-sight, with the solid lines being the histogram
for the galaxies found within $r_{180b}$, i.e. the "true" cluster
members. There is a large variation in the velocity dispersion
profiles, even when only true members are included.
The interloper structure seen will be discussed in §5.

Figure 10. Line of sight histograms for the same massive ($M_{180b} = 2.4\times10^{14}\,h^{-1}M_\odot$) cluster along three different lines of sight.
(Left) velocity histograms for all galaxies within $r_{180b}$ in plane of sky (dotted) and true cluster members (solid). The smooth curves are
Gaussians with the same area under the curve as the line of sight velocity distributions, and with dispersion fit to the core of the line of
sight velocity distribution to guide the eye. (Center) observed phase space diagram, transverse radius vs. redshift space position (including
peculiar velocities). Solid (blue) triangles are true cluster members, open squares are interlopers from halos with mass $< 0.2M_{\rm clus}$ and
squares with (red) crosses inside are interlopers from halos with mass $> 0.2M_{\rm clus}$, i.e. massive neighbors. (Right) true phase space
diagram, transverse radius vs. true line-of-sight position (absent peculiar velocities). The bottom row is for a sightline where there are
only 3 nearby galaxies (out of 87) and all measures are within 50 per cent of the true mass, the middle row has many nearby galaxies,
but none from massive halos and overpredicts the mass in both Compton distortion and weak lensing by at least 50 per cent, and the
top row has nearby galaxies from a nearby massive halo, about $10\,h^{-1}$Mpc in the foreground, and overpredicts the mass from lensing,
velocity dispersions and Compton distortion. None of these lines of sight have appreciable substructure using the Dressler-Shectman
test described in §4.3.



4.3 Substructure 

Our massive halos contain significant substructure in both 
physical and velocity space, which is frequently attributed 
to the active merger histories of massive halos. We find that 
groups of subhalos which fall in together remain highly cor- 
related for significant spans of time (several Gyr). In many 
respects these past accretion or merger events are still "on- 
going", in that the 3D density field has multiple distinct 
maxima and one can still see kinematically distinct groups
of subhalos which were part of the merger partner and fell
in together at that time. One example is given in Fig. 11,
which shows the tracks of a small subset of the subhalos 
in a massive cluster and illustrates the long-term coherence 
of the group of subhalos even as it moves within the virial 
radius of its host. Though they are not highlighted in the 
figure, there are several other major groupings of subhalos 
that were accreted together and have survived for some time. 
Each has had a complex merger history but shows a long 
term persistence even though it is now well inside the for- 
mal virial radius of the host halo. It is an over-simplification 
to assume that when a halo falls into a larger neighbor and 
becomes a satellite that all its satellites become associated 
with the larger halo and evolve independently. 

Not only do halos survive as distinct entities but groups
of subhalos do as well. In fact, for massive halos, 30% of
the subhalos in our sample are satellites when they fall in.

Figure 11. The persistence of substructure. The tracks show a small subset of the subhalos which fell into a large halo in the simulation
as part of a large group at $z \simeq 0.3$, corresponding to the last major merger for this halo. Each panel is $6\,h^{-1}$Mpc on a side, centered on
the $z = 0$ position of the most bound particle in the halo, and the dashed line marks the virial radius ($r_{200c}$) of the main halo. To avoid
crowding only a small fraction of the subhalos, the main progenitors with $M_{\rm inf} > 10^{12}\,h^{-1}M_\odot$, are plotted. Subhalos which merge with
these halos before $z = 0$, and any subhalos which were not part of this group back at $z \simeq 0.3$ are omitted. Note the coherence of this
"group of subhalos" for $3\,h^{-1}$Gyr.

Figure 7. The composite histogram, normalized to unit area,
of velocity dispersions for 10 clusters with $3\times10^{14}\,h^{-1}M_\odot <
M_{180b} < 3.5\times10^{14}\,h^{-1}M_\odot$, in two cases: with only the true
group members included (solid, black line, with 10,000 lines of
sight per cluster) and with the full line-of-sight with interlopers
removed as in §5 (red, dotted line, with 96 lines of sight per
cluster). The curves indicate Gaussian fits. The vertical line at
$\sigma \simeq 620$ km/s is the average of the three-dimensional velocity
dispersions for these clusters. The line-of-sight measurement with
interloper rejection has a lower average velocity dispersion (as
seen also in van Haarlem et al. 1997; Biviano et al. 2006). Even
this narrow range of halo masses exhibits a wide range of $\sigma$, as
discussed in the text, with Gaussian fits of width 100 km/s. The
skewness in the distribution is not present for all samples of this
size.



One consequence of this has been seen in other contexts:
satellites often merge with other satellites (rather than the
central galaxy of their current halo), and the satellite they
merge with is often the old central of the halo they were
in prior to the merger (Angulo et al. 2009; Simha et al. 2009;
Wetzel, Cohn & White 2009). Visually we also saw correlated
velocities between nearby satellites with different originating
groups, presumably due to infall along a common
filament. This long-term dynamical coherence also indicates
that care should be taken when assuming relative velocities
between galaxies are a substantial fraction of the host virial
velocity, e.g. when estimating merger rates or impulses.

Figure 8. The line-of-sight velocity dispersion vs. the number of
subhalos included (ordered by $M_{\rm inf}$, i.e. luminosity, from highest
to lowest) for a halo with $z \simeq 0.1$ mass $4\times10^{14}\,h^{-1}M_\odot$. We
include only subhalos which lie within the friends-of-friends halo,
excluding any interlopers. The solid line shows the isotropic dispersion
(i.e. $\sigma_{3D}/\sqrt{3}$) while the dotted and dashed lines show the
dispersion along the eigenvector directions corresponding to the
smallest and largest eigenvalues of $\sigma^2_{ij}$.

All of our simulated clusters have very obvious substructure.
We have implemented several standard tests for
dynamical substructure, which have been frequently applied
to simulations and observed clusters in the literature, to
see how well they find the substructure we know to be
there. An excellent review of these methods can be found
in Pinkney et al. (1996); some more recent statistics are
summarized in Hou et al. (2009). We focus on the three
dimensional test of Dressler & Shectman (1988), and the
one dimensional tests of Kolmogorov and Anderson-Darling
described in Hou et al. (2009) and refer the reader to papers
there for details. The Dressler & Shectman (1988) test
has been applied to simulations previously (e.g. Cen 1997;
Knebe & Müller 2000), but usually to a large subset of the
dark matter particles in the cluster rather than subhalos.
Because we use subhalos identified with galaxies within the
simulation, our methods are a further step in quantifying the
difficulty of identifying cluster substructures observationally.
When dark matter particles are used the large number available
allows them to trace the cluster structure more faithfully
than the observationally available galaxies, but using
random dark matter particles as sample galaxies misses the
dynamical coherence of groups of substructures that naturally
arises in hierarchical structure formation scenarios.

Figure 9. As in Fig. 8, the line-of-sight velocity dispersion vs. the
number of subhalos included (ordered by $M_{\rm inf}$ from highest to
lowest), but now for a range of halos and focusing on the region
$N_{\rm sub} \le 50$. As before, we include only subhalos which lie within
the friends-of-friends halo, excluding any interlopers, and plot
the line-of-sight dispersion normalized by the dispersion of the
dark matter within the friends-of-friends halo. Note that each
line converges to a stable result for large numbers of subhalos but
the value depends upon the cluster and line-of-sight chosen, as
does whether the approach is from above or below.

We find many of our clusters show signs of substructure
along some lines of sight. Surprisingly, we find none of
the three substructure indicators is well correlated with the
time since last major merger, and the values of the indicators
are very dependent on viewing angle for a given cluster
even before we consider interlopers due to projection. If the
substructure is well separated along the line-of-sight from
the bulk of the galaxies, then it is caught by each of the indicators.
Otherwise it can be missed. As an example we pick
one cluster, containing 57 subhalos brighter than $M^* + 1$.
When viewed down the z-axis it is not flagged as having substructure
by the tests we consider: the probability-to-exceed
for the Dressler-Shectman test is 54 per cent, $D^* = 1.18$
and $A^* = 1.68$ (Hou et al. (2009) suggest that $D^*$ in excess
of 1.2 or $A^*$ in excess of 1.9 indicate the presence of
substructure). However, viewing this same cluster down the
x-axis the probability-to-exceed for Dressler-Shectman is 3
per cent, $D^* = 2.76$ and $A^* = 10$. There are many similar
examples. In some cases the Dressler-Shectman test
flags substructure where the other tests do not, while for
others the situation is reversed. Sometimes the Dressler-Shectman
$\Delta$ statistic is high, but similar or larger values
are obtained when shuffling the velocities (as described in
Dressler & Shectman 1988), leading to a higher probability-to-exceed
or a lower significance detection of substructure.
In these cases the prevalence of substructures in the host
halo means that the "shuffled" statistics are not faithfully
representing the "no substructure" scenario, leading one to
erroneously assume the observed value of the statistic is consistent
with no substructure.

^ This is in contrast to Cen (1997) who only found a large
amount of substructure after interlopers were included; one possible
source of the difference is our use of galaxy subhalos
rather than dark matter particles. Our results lend support to
Crone, Evrard & Richstone (1996) who found such tests perform
relatively poorly as cosmological indicators.
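The Dressler-Shectman statistic and its shuffling-based probability-to-exceed can be sketched as follows. This is a minimal version of the commonly quoted form of the test (local mean and dispersion from each galaxy plus its 10 nearest projected neighbours), not necessarily identical in detail to the implementation used for the results above; the function names, the number of shuffles and the use of sample standard deviations are assumptions.

```python
import numpy as np

def ds_delta(x, y, v, n_nn=10):
    """Dressler-Shectman Delta: sum over galaxies of the local deviation delta_i."""
    x, y, v = (np.asarray(a, dtype=float) for a in (x, y, v))
    v_mean, v_std = v.mean(), v.std(ddof=1)
    # Pairwise projected separations; each galaxy is its own nearest neighbour
    dist = np.hypot(x[:, None] - x[None, :], y[:, None] - y[None, :])
    nearest = np.argsort(dist, axis=1)[:, : n_nn + 1]
    v_loc = v[nearest]                                # shape (N, n_nn + 1)
    loc_mean = v_loc.mean(axis=1)
    loc_std = v_loc.std(axis=1, ddof=1)
    delta_sq = (n_nn + 1) / v_std**2 * (
        (loc_mean - v_mean) ** 2 + (loc_std - v_std) ** 2
    )
    return float(np.sqrt(delta_sq).sum())

def ds_probability_to_exceed(x, y, v, n_nn=10, n_shuffle=500, seed=0):
    """Observed Delta and the fraction of velocity-shuffled samples with Delta >= it."""
    rng = np.random.default_rng(seed)
    v = np.asarray(v, dtype=float)
    observed = ds_delta(x, y, v, n_nn)
    shuffled = np.array(
        [ds_delta(x, y, rng.permutation(v), n_nn) for _ in range(n_shuffle)]
    )
    return observed, float(np.mean(shuffled >= observed))
```

Shuffling keeps the global velocity distribution, including any contribution from infalling groups, and only removes the position-velocity correlation. That is why a halo full of substructure can yield large shuffled values and hence an unremarkable probability-to-exceed.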

These results suggest caution when interpreting lack of
observed substructure in the galaxy distribution as evidence
for a dynamically relaxed, steady-state object (e.g. justifying
the use of the virial theorem or Jeans analysis without
the time derivative). A cluster can be undergoing substantial
mass accretion, i.e. be far from steady state, and still
not be seen to have substructure along some lines of sight.
The viewing angle dependence also complicates inferences
about incidence of dynamical evolution of cluster galaxies
from observed interactions of subclusters within the cluster
identified through substructure finding techniques. There
are some indications (e.g. Biviano et al. 1996; Adami et al.
2005) that more sophisticated substructure finding techniques
could yield more complete information in the limit
of hundreds of spectroscopic redshifts per cluster. Since we
found earlier that the dynamics of the subhalos approached
that of the dark matter particles as we progressed down
the subhalo mass function, we expect very minor differences
with earlier work when hundreds of subhalos are included.



5 INTERLOPERS 

The intrinsic line-of-sight scatter in velocity dispersion discussed
above (§4.2) is a "best case" estimate, where we have
perfect identification of cluster members. In observations,
an extra complication is provided by "interloper" galaxies
which lie close to the cluster in the plane of the sky and
in velocity but which nevertheless are really members of a
different halo. Restricting samples to elliptical galaxies or
matching on photometric properties can help, but does not
solve the problem completely. Conversely, measurement errors
in the velocities (which we do not model) can exacerbate
the interloper problem, though it is expected that typical
velocity errors will have only a small effect on estimates
(Biviano et al. 2006).

Returning to Fig. 10 we now turn attention to the interlopers
in the line-of-sight velocity histograms. In the middle
and right columns we see the galaxies in phase space and
physical space with true members represented by filled triangles,
interlopers represented by open boxes and interlopers
from massive halos (with mass $> 0.2M_{\rm clus}$) represented by
open boxes with crosses in them. Depending upon the line of
sight, the same cluster can have (top to bottom): contributions
from nearby massive halos, contributions only from less
massive halos, or few interlopers. In these three instances the
inferred red-sequence richness is, respectively, very high, high and
close to the mean for a cluster of this mass. It is typical that a single
halo exhibits each of these characteristics when viewed
from different directions, as a large fraction of halos have a
massive neighbor. For the ensemble of velocity dispersions
in our sample, while the distributions can often be well fit by
a Gaussian profile, a non-trivial fraction of the lines-of-sight
lead to "flat topped", skew or bi-modal distributions or distributions
that can be fit with Gaussians plus an excess in
the wings (as seen in observations, e.g. Milvang-Jensen et al.
2008). In some cases interlopers cause an excess in the center
of the velocity distribution.



5.1 Interloper removal 

Several techniques have been devised to identify and reject
these interlopers. Since in the simulations we know which
objects are true cluster members we can apply these algorithms
to our samples to see how they perform. Such investigations
have been done before (e.g. Perea et al. 1990;
den Hartog & Katgert 1996; van Haarlem et al. 1997; Cen
1997; Diaferio et al. 1999; Łokas et al. 2006; Wojtak et al.
2007, 2009) but typically using randomly selected dark matter
particles rather than subhalos. By using subhalos we keep
any correlations between subhalo positions and large-scale
environment or between subhalos which fell in together as
part of a larger structure. Including the interlopers in our
mock observations and then using observational techniques
to attempt to remove them is also important for estimating
the scatter induced by the cosmic web, a main concern of
this paper.

One of the simplest, and most widely used, interloper
rejection methods is $3\sigma$ clipping (Yahil & Vidal 1977;
Łokas et al. 2006; it has been applied in some large surveys
and individual objects of special interest such as
Halliday et al. 2004; Gal & Lubin 2004; Becker et al. 2007;
Milvang-Jensen et al. 2008; Kurk et al. 2009), which uses the
fact that line-of-sight velocities of cluster members are close
to Gaussian and iteratively excludes all galaxies more than $3\sigma$
away from the mean. Given enough galaxies one can perform this
procedure in bins of transverse radius. Perea et al. (1990)
developed a method based on removing galaxies whose absence
causes the largest change in a mass estimator while
Diaferio & Geller (1997) proposed the use of caustics and
Prada et al. (2003) proposed an escape velocity cut. Various
authors argue that the 'gaps' in the velocity distribution
give a better rejection criterion (Zabludoff et al. 1990;
Katgert et al. 1996; Owers, Couch & Nulsen 2009). Methods
which use both projected coordinates and velocity information
were introduced by den Hartog & Katgert (1996) and
Fadda et al. (1996).
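Iterative $3\sigma$ clipping as described above can be sketched in a few lines. The version below is a generic illustration rather than the exact procedure of any of the cited works: it clips about the mean using the sample standard deviation, only ever removes galaxies, and stops when the member list no longer changes. The function name and the iteration cap are assumptions.

```python
import numpy as np

def sigma_clip_members(v, n_sigma=3.0, max_iter=100):
    """Boolean mask of galaxies kept after iterative n-sigma clipping about the mean."""
    v = np.asarray(v, dtype=float)
    keep = np.ones(v.size, dtype=bool)
    for _ in range(max_iter):
        mean = v[keep].mean()
        std = v[keep].std(ddof=1)
        # Galaxies already rejected stay rejected
        new_keep = keep & (np.abs(v - mean) <= n_sigma * std)
        if np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return keep
```

Applying the same function separately to galaxies grouped by projected radius gives the binned variant mentioned in the text.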

We tested a number of interloper rejection algorithms.
Here we focus on an example of the more complex methods
which uses projected coordinates and velocity information
(see e.g. den Hartog & Katgert 1996; Biviano et al. 2006;
Wojtak et al. 2007, and references therein), and its comparison
to simple $3\sigma$ clipping. Such comparisons have been performed
before (van Haarlem et al. 1997; Wojtak et al. 2009)
but usually using randomly selected particles from lower resolution
simulations rather than subhalos. Again this means
that correlations between observed galaxy properties are
more faithfully tracked in our case.

Our implementation is as follows. We assume that a
trial center of the cluster has been determined. All galaxies
with velocities within 3,000 km/s of the central velocity and
projected radius smaller than $r_{180b}$ are then selected. The
weighted gap method (e.g. Girardi et al. 1993) is then used
to further remove galaxies along the line-of-sight. Specifically,
gaps are defined as $g_i = v_{i+1} - v_i$ for the sorted velocities
and weights as $w_i = i(N - i)$, for $i = 1, \ldots, N - 1$
for $N$ galaxies. Galaxies to one side of a weighted gap larger
than 3 are removed, where the weighted gap is defined as

$\frac{\sqrt{g_i w_i}}{MM(\sqrt{g_i w_i})}$ (5)

with the midmean of the weighted gaps defined as

$MM(\sqrt{g_i w_i}) = \frac{2}{N}\sum_{i=N/4}^{3N/4}\sqrt{g_i w_i}$ (6)

The motivation for such a cut lies in the expectation that
the velocity dispersion is Gaussian, and the assumption
that when the galaxy distribution departs from a Gaussian
"core" it is no longer associated with the halo of interest.
We use a modification to the weighted gap as described
by Owers, Couch & Nulsen (2009), where it is applied separately
in annuli of 50 galaxies each. Otherwise, we found
the weighted gap tended to throw out too many galaxies.
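The weighted gap step can be sketched as below. Two choices here are assumptions rather than statements from the text: the midmean is taken as the mean of the central half of the ordered weighted gaps, and "galaxies to one side" of a significant gap is interpreted as keeping the contiguous velocity group that contains the median galaxy. The function names are hypothetical, and the annulus-by-annulus application is omitted.

```python
import numpy as np

def normalised_weighted_gaps(v):
    """Sorted velocities and sqrt(w_i g_i) divided by its midmean."""
    v_sorted = np.sort(np.asarray(v, dtype=float))
    n = v_sorted.size
    i = np.arange(1, n)                        # i = 1, ..., N-1
    gaps = np.diff(v_sorted)                   # g_i = v_{i+1} - v_i
    weighted = np.sqrt(i * (n - i) * gaps)     # sqrt(w_i g_i), w_i = i (N - i)
    # Midmean: mean of the central half of the ordered weighted gaps (assumed)
    ordered = np.sort(weighted)
    midmean = ordered[n // 4 : 3 * n // 4].mean()
    return v_sorted, weighted / midmean

def gap_clip(v, threshold=3.0):
    """Keep the contiguous velocity group containing the median galaxy."""
    v_sorted, wgap = normalised_weighted_gaps(v)
    # Gap k (0-based) lies between sorted galaxies k and k + 1
    cuts = np.flatnonzero(wgap > threshold)
    mid = v_sorted.size // 2
    lo = cuts[cuts < mid].max() + 1 if np.any(cuts < mid) else 0
    hi = cuts[cuts >= mid].min() + 1 if np.any(cuts >= mid) else v_sorted.size
    return v_sorted[lo:hi]

# Mock: 60 cluster galaxies plus a clump of 8 offset by about 6000 km/s
main = np.linspace(-1000.0, 1000.0, 60)
clump = 6000.0 + np.linspace(-100.0, 100.0, 8)
kept = gap_clip(np.concatenate([main, clump]))
```

In this mock the only significant gap separates the clump from the main body, so `kept` contains the 60 main galaxies. When interloper structure sits close in velocity no gap exceeds the threshold and nothing is removed, which is the failure mode for nearby clusters or filaments viewed end-on.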

Now we use the projected distribution to define a further,
transverse radius dependent, velocity cut and iteratively
remove galaxies beyond this cut. The cut depends on
the (projected) harmonic radius, defined as

$\frac{N(N-1)}{R_h} = \sum_{i\neq j}\frac{1}{|R_i - R_j|}$ (7)

where the sum is over galaxies out to radius $R$. If the velocity
dispersion of the currently remaining galaxies is $\sigma$, we define
a circular velocity as

$v_c^2(R) = 3\pi\,\sigma^2(R)\,\frac{R_h(R)}{R}$ (8)

Typically $R_h/R \sim 1/2$ and decreases from center to edge,
as the profile becomes steeper. From $v_c$ we further define a
"freefall" velocity, $v_{ff} = \sqrt{2}\,v_c$. Then a galaxy at (projected)
radius $R$ is an interloper if it is further from the cluster
center, in velocity, than

$\max\left[v_{ff}(R)\cos\theta,\; v_c(R)\sin\theta\right]$ (9)

where $\theta$ is the angle between radial vector and the line-of-sight
and the maximization is done over the (real-space)


line-of-sight position of the galaxy (the idea being that any
observed galaxy can either be on a circular or radial orbit,
with different boundedness criteria). Finally the velocity
dispersion of the remaining galaxies is estimated using the
bi-weight estimator described in Beers et al. (1990), and we
shall refer to this as $\sigma_{\rm kin}$.

^ Stacking the velocity histograms for all lines of sight corresponding
to some richness or mass range also yields an approximate
Gaussian, with excess in the far wings, which also has been
observed (e.g. van der Marel et al. 2000).

^ The results were stable for between 25 and 50 galaxies per
annulus; for definiteness we used 50.

^ For example, if the density profile is a power-law, $\Sigma(R) \propto R^{-p}$,
the ratio $R_h/R = (3 - 2p)/(4 - 2p)$.

Figure 12. The distribution of mass predictions for the ensemble
of sightlines for our massive sample using $3\sigma$ clipping, $\sigma_{\rm kin}$ and
$M_{\rm kin}$ as described in the text. The areas under the curves differ
because extreme outliers extend beyond the x-axis range shown.
The "virial" mass, $M_{\rm kin}$, is the best tracer of the true mass.

Though there is a wide diversity in cluster behaviors, 
the method of interloper rejection is more important than 
the precise dispersion estimator. The use of the on-sky po- 
sitions to define a transverse radius dependent velocity cut 
performs slightly better than a fixed threshold, but in both 
cases the threshold varies significantly from step to step and 
can remove true cluster members while keeping actual inter- 
lopers. 
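For reference, the bi-weight scale estimator of Beers et al. (1990) that is used for the dispersion can be sketched as follows. This uses the commonly quoted form with tuning constant $c = 9$ and the median as the location estimate; both are assumptions about implementation detail that the text does not specify, and the function name is hypothetical.

```python
import numpy as np

def biweight_scale(x, c=9.0):
    """Bi-weight scale estimate, in the commonly quoted Beers et al. (1990) form."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    if mad == 0.0:
        return 0.0
    u = (x - med) / (c * mad)
    inside = np.abs(u) < 1.0              # points beyond c * MAD get zero weight
    d = x[inside] - med
    u2 = u[inside] ** 2
    numerator = np.sqrt(np.sum(d**2 * (1.0 - u2) ** 4))
    denominator = np.abs(np.sum((1.0 - u2) * (1.0 - 5.0 * u2)))
    return float(np.sqrt(x.size) * numerator / denominator)
```

The estimator down-weights the wings of the velocity distribution, so a handful of surviving interlopers changes it far less than they change a plain standard deviation. It cannot compensate for members removed, or interlopers kept, near the core, consistent with the rejection step mattering more than the choice of estimator.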

Fig. 12 compares results from $3\sigma$ clipping and our more
complex, phase-space based interloper rejection scheme for
a sample of massive clusters. Except for extreme outliers,
where the phase-space method performs slightly better, the
distributions are quite close and noticeably non-Gaussian.
These results are quite insensitive to cluster mass.

The more complex algorithm can fail in some instances. 
We found the most sensitive step was the weighted gap mea- 
surement, which can fail when the interloper structure along 
the line of sight is too close to define a clear gap in the ve- 
locity histogram. This is the case, for example, when two 
clusters are fairly close in one line of sight or when we see 
a chain of small substructures close together, as one would 
expect when looking down a filament. In these cases the 
weights given to the gaps do not work properly and gaps 
are not properly detected. 

Once we include line-of-sight projections and the need
for interloper removal, there is some gain to having more
galaxies in order to better estimate the cluster potential
(Fig. 13) but the intrinsic scatter due to the velocity ellipsoid
remains a fundamental limitation (e.g. Fig. 7). As the
number of galaxies with which we estimate the dispersion
increases the estimate becomes stable but is still a relatively
poor estimate of the angle-averaged dispersion.




Figure 13. As in Figs. 8 and 9, the line-of-sight velocity dispersion
(in units of the isotropically averaged dispersion for all
subhalos) as a function of number of subhalos used, but now including
non-member subhalos and using the interloper rejection
scheme described in the text. The x-axis gives the number of subhalos
used to compute the dispersion, after interloper removal.
We have plotted 3 lines-of-sight for halos with $M_{180b}$ in the range
$(0.6-1.0)\times10^{15}\,h^{-1}M_\odot$.



5.2 Degradation due to interlopers 

Applying this technique to our clusters, along 96 lines of
sight each, allows us to find and compare the distributions
of $\sigma$ values resulting from interlopers (and their rejection
as described above) and the distribution due to intrinsic
line of sight variation. The dotted line in Fig. 7 shows the
distribution of $\sigma_{\rm kin}$ for the 96 sightlines for 10 clusters in
mass range $(3 - 3.5)\times10^{14}\,h^{-1}M_\odot$. The standard deviation
in $\sigma_{\rm kin}$ is about 100 km/s when only cluster member
galaxies are included and is approximately 10 per cent larger
when including (and then rejecting) interlopers. There is
a slight downward shift in the mean $\sigma_{\rm kin}$, as was also seen
by Biviano et al. (2006). These trends are reproduced for
higher and lower mass clusters.

[Footnote: A study of cluster velocity dispersions (Weinmann et al. 2009)
applies the bi-weight estimator to a subset of clusters in the Millennium
simulation and also finds a large scatter (their Figure 1).]

The line-of-sight dispersion is only one piece of information
available to estimate masses, and other information can
be introduced. For example, one can include the compactness
of the cluster, estimated from the projected member positions,
or go further, including corrections for surface terms
and orbital anisotropy, and beyond. It is not our intention
here to model each of the (complex) methods which have
been presented in the literature, but we do note that the
next-to-simplest suggested mass estimator is proportional
to $\sigma_{\rm kin}^2 R_h$, where $R_h$ is the (projected) harmonic
radius defined above. We shall denote this $M_{\rm kin}$. Formally this
estimator would be valid only for spherical, isolated systems with
galaxies tracing mass, but due to a correlation between errors
in $R_h$ and $\sigma_{\rm kin}$ we find $M_{\rm kin}$ produces a tighter, less
skewed estimate of $M_{180b}$ than the pure dispersion based measures
(see Fig. 12), even in our more complex systems. Although
there is some variation from cluster to cluster, most often
a higher-than-average $\sigma_{\rm kin}$ is compensated by a
lower-than-average $R_h$. The compensation is not perfect, but it reduces
the significance of the fluctuation, leading to more lines of
sight within the core of the distribution and fewer strong
outliers. (Similar cancellations were seen by Biviano et al.
(2003) when comparing observations with and without interlopers.
They found the tendency of interlopers to bias $\sigma_{\rm kin}$
low was (over-)compensated by their tendency to bias $R_h$
upwards.) For this reason it is $M_{\rm kin}$ which we correlate with
other quantities in the following section.
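The two ingredients of this estimator can be sketched as follows. This is a minimal illustration, not the pipeline used for the results above: it assumes the dispersion is the plain standard deviation of member line-of-sight velocities, takes the harmonic radius to be the inverse of the mean inverse projected pair separation (other conventions differ by constant factors), and leaves the overall normalization as a free, hypothetical `prefactor`.

```python
import numpy as np

def sigma_los(v_los):
    """Line-of-sight velocity dispersion (sample standard deviation)."""
    return float(np.std(np.asarray(v_los, dtype=float), ddof=1))

def harmonic_radius(x, y):
    """Projected harmonic radius: inverse of the mean inverse pair separation.

    The normalization is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(x), k=1)          # all distinct pairs i < j
    sep = np.hypot(x[i] - x[j], y[i] - y[j])     # projected separations
    return float(1.0 / np.mean(1.0 / sep))

def m_kin(v_los, x, y, prefactor=1.0):
    """Mass estimate proportional to sigma_kin^2 * R_h (arbitrary units)."""
    return prefactor * sigma_los(v_los) ** 2 * harmonic_radius(x, y)
```

Because the estimator is a product, a sightline on which the dispersion fluctuates up while the member positions appear more compact moves `m_kin` less than it moves the dispersion alone, which is the compensation described in the text.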

We also note that Biviano et al. (2006) found a correlation
between catastrophic outliers in the mass-$\sigma_{\rm kin}$ or
mass-$M_{\rm kin}$ relations and substructure. Using the
Dressler-Shectman test on the galaxies which were selected using our
interloper rejection procedure we found that 39 per cent of
the lines of sight had substructure ($P < 0.05$) and of these
only 10 per cent had > 50 per cent deviations between $M_{\rm kin}$
and true mass. By contrast, of the lines of sight with > 50
per cent deviation in $M_{\rm kin}$, 52 per cent had substructure, to
be compared to 40 per cent for lines of sight where $M_{\rm kin}$ is
a reliable estimate of mass. Thus outliers in $M_{\rm kin}$ do tend to
have detectable substructure more often than non-outliers,
but substructure does not necessarily lead to $M_{\rm kin}$ outliers
and thus is not a reliable flag for them.
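For readers unfamiliar with the test, the sketch below implements the Dressler-Shectman statistic in its commonly quoted form: each galaxy and its `n_nn` nearest projected neighbours define a local mean velocity and dispersion, which are compared to the global values, and the significance is estimated by shuffling velocities among positions. The neighbour count, the number of shuffles and the function names are choices of this sketch, and the implementation used for the numbers above may differ in detail.

```python
import numpy as np

def ds_delta(x, y, v, n_nn=10):
    """Dressler-Shectman Delta: sum over galaxies of the local deviation."""
    x, y, v = (np.asarray(a, dtype=float) for a in (x, y, v))
    v_mean, sigma = v.mean(), v.std(ddof=1)
    # Pairwise projected distances; each galaxy is its own nearest neighbour.
    d2 = (x[:, None] - x[None, :]) ** 2 + (y[:, None] - y[None, :]) ** 2
    nbrs = np.argsort(d2, axis=1)[:, : n_nn + 1]   # galaxy + n_nn neighbours
    v_loc = v[nbrs].mean(axis=1)
    s_loc = v[nbrs].std(axis=1, ddof=1)
    delta2 = (n_nn + 1) / sigma ** 2 * ((v_loc - v_mean) ** 2 + (s_loc - sigma) ** 2)
    return float(np.sqrt(delta2).sum())

def ds_pvalue(x, y, v, n_nn=10, n_shuffle=500, seed=0):
    """Fraction of velocity-shuffled realizations with Delta >= observed."""
    rng = np.random.default_rng(seed)
    v = np.asarray(v, dtype=float)
    observed = ds_delta(x, y, v, n_nn)
    shuffled = [ds_delta(x, y, rng.permutation(v), n_nn) for _ in range(n_shuffle)]
    return float(np.mean(np.array(shuffled) >= observed))
```

A sightline would be flagged as having substructure when `ds_pvalue` falls below 0.05. Because the statistic uses only projected positions and line-of-sight velocities, the same three-dimensional group can be flagged from one direction and missed from another.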



6 MULTIWAVELENGTH MEASURES: 
CORRELATED SCATTER 

Although clusters obey tight scaling relations, we expect
a large scatter in individual measures of cluster mass/size.
Clusters are generally triaxial and highly biased. They are
formed and fed at the intersection of a network of filaments
in atypical and anisotropic cosmological environments. Their
mass accretion is punctuated by a series of mergers with
other massive objects. With the growing number of
multiwavelength, large area surveys underway, multiple
measurements of large numbers of clusters are possible, and there is
a hope that different methods can cross-check each other.

As scatter is often caused by the cosmic web, measures
sensitive to the web will have correlations induced in
their scatter. Consider an idealized model, in which galaxies
flow into the cluster from a small number of (approximately
straight) filaments, retaining memory of this due to
incomplete virialization. In such a scenario, we might expect the
line-of-sight component of the velocities to be biased for the
same viewing angles as the projected mass, projected
pressure and projected galaxy number density. This
would lead to correlations in the mass inferred by richness,
dynamics, Compton distortion and lensing. In this section
we consider the relative strengths of these correlations and
their causes in the local cluster environment.

Figure 14. Correlated scatter for an individual cluster between
velocity dispersion $\sigma_{\rm kin}$ and lensing dispersion $\sigma_\zeta$. The circles
correspond to the three lines of sight shown in Fig. 10 for the same
cluster. Of the three, the lowest lensing dispersion corresponds
to the bottom panel there (with hardly any nearby interlopers),
the largest lensing dispersion (and lowest $\sigma_{\rm kin}$) corresponds to
the center panel with many low mass neighbors, and the largest
velocity dispersion corresponds to the top panel, with a high mass
nearby halo. The isotropic velocity dispersion for this cluster is
690 km/s.



Taking the median covariances from all the massive clusters,
the largest covariances were between red galaxy richness
and all other quantities, followed by covariances of velocity
dispersion with the other probes. In terms of scatter of
$M_{\rm pred}/M_{\rm true} - 1$, the tightest correlations with mass in our
measures were for Compton distortion and phase space richness,
followed by weak lensing, and then red galaxy richness
and velocity dispersion. We show in Fig. 14 the measures
of lensing dispersion and velocity dispersion along all 96 lines
of sight for the same cluster as in Fig. 10; a correlation can
be seen.

The correlations between individual measurements were
usually below 0.5, indicating that each additional observation
is adding significant new information about the
mass/size of the cluster, with the lower dispersion measures
giving the tightest constraints. It should be borne in mind
though that the distributions were far from Gaussian, and
the mass function steeply falling, so errors should be
interpreted with care.

To compare the ensemble of multiwavelength measurements
for all the lines of sight for all the clusters, we
fit mean power-law relations between the observables and
mass to convert each multiwavelength measure to a common
system (the "predicted" mass). Then we divided up
the sightlines into "good" and "bad" based on whether
$|M_{\rm true} - M_{\rm pred}|/M_{\rm true} > 0.5$ for at least 2 independent
observables. The bad sightlines comprised 8(11) per cent
of the 96 sightlines per cluster with $M > 2(1)\times10^{14}\,h^{-1}M_\odot$.
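The classification just described can be sketched as below. The sketch assumes an ordinary least-squares fit of lg M against the logarithm of each observable, which is only one way to define a "mean power-law relation"; the dictionary layout and function names are illustrative.

```python
import numpy as np

def fit_power_law(obs, mass):
    """Least-squares fit of lg M = a + b lg(obs); returns (a, b)."""
    b, a = np.polyfit(np.log10(obs), np.log10(mass), 1)
    return a, b

def predicted_mass(obs, a, b):
    """Convert an observable to a 'predicted' mass with the fitted relation."""
    return 10.0 ** (a + b * np.log10(obs))

def bad_sightlines(m_true, observables, threshold=0.5, min_bad=2):
    """Flag sightlines where at least `min_bad` observables miss by > threshold.

    observables: dict mapping a name to an array with one value per sightline.
    """
    m_true = np.asarray(m_true, dtype=float)
    n_off = np.zeros(m_true.shape, dtype=int)
    for values in observables.values():
        values = np.asarray(values, dtype=float)
        a, b = fit_power_law(values, m_true)
        m_pred = predicted_mass(values, a, b)
        n_off += np.abs(m_true - m_pred) / m_true > threshold
    return n_off >= min_bad
```

Requiring two observables to fail simultaneously is what separates sightlines with a common cause of error, such as projected structure, from those where a single probe happens to fluctuate.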



[Footnote: The scatter in Compton distortion and projected mass could
be increased by material outside our box, which we have not modeled.]

[Footnote: For this analysis we discarded lines of sight for clusters where
a higher mass cluster was found along the line of sight within
$r_{180b}$ on the plane of the sky. This takes out 70 out of our 7,968
massive sightlines.]

6.1 Basic results

Considering richness, dynamics, lensing and Compton distortion,
we found that the degree of correlation between different
measures of cluster size varied dramatically from cluster
to cluster, each sampling a different, local, cosmic web.
Over half of the massive ($M > 2\times10^{14}\,h^{-1}M_\odot$) clusters had
at least one sightline where at least 3 measures were off (the
most common sources of mass errors were in red galaxy richness,
velocity dispersion and lensing), and more than half of
the bad sightlines were due to 18/83 of the clusters, each
with 10 or more bad sightlines. As ~ 1/3 of the sightlines
had at least one quantity giving more than a 50 per cent
error from the mean mass relation, the reduction in error
to 8 per cent of the sightlines when using at least two measures
is a significant improvement. Using only galaxies with
$L > 0.4L_*$ (rather than $L > 0.2L_*$) resulted in a small
increase in the number of bad sightlines (from 8 to 11 per
cent).

Scatter in the observables can arise from several violations
of the idealized, isolated, relaxed, spherical halo
assumption. The halo itself can be irregular (e.g. recently
merged), or regular but anisotropic. Nearby correlated material
can be seen in projection, or uncorrelated material at
large distances can be projected onto the cluster position.
We have discussed halo state and anisotropy above. Here
our interest is in the comparison of nearby structure and
substructure for bad and good lines of sight. We considered
a cluster to have nearby massive structure if at least
three $L > 0.4L_*$ galaxies from another halo(s) of mass
$> 0.2\,M_{\rm cluster}$ were present within $3\sigma_{\rm kin}$ in redshift space
and within $r_{180b}$ in the plane of the sky, and to have nearby
less massive structure if nearby massive structure was not
present as above and at least eight $L > 0.4L_*$ galaxies from
halos with mass $< 0.2\,M_{\rm cluster}$ were within the same region.
For halos above $2\times10^{14}\,h^{-1}M_\odot$ we found 21 per cent of the
bad sightlines had nearby massive structure and 49 per cent
had nearby less massive structure, compared to 2 and 25 per
cent respectively for the good sightlines. The bad sightlines
were 10 times more likely to have a nearby massive halo and
almost twice as likely to have nearby less massive halos. A
larger fraction (52 per cent) of the bad sightlines have cluster
substructure (Dressler-Shectman probability less than 0.05,
as described in §4.3), compared to 38 per cent of the good
sightlines. All together, 80 per cent of the bad sightlines
had one of these three indicators (nearby massive structure,
or numerous less massive structure, or substructure) compared
to 51 per cent of the good sightlines. These numbers
changed very little when we lowered the mass threshold.

[Footnote: We have tended to ignore this contribution (uncorrelated
material at large distances) here, as our box is too small to fairly
sample it and it has been extensively studied elsewhere.]

However, although the likelihood of substructure,
nearby massive or less massive halos increased for bad sightlines,
the majority of sightlines with substructure, nearby
massive or less massive halos were not bad sightlines. Of the
39 per cent of the lines of sight which have substructure detected,
only 10 per cent are bad lines of sight. Similarly, of
the 4 per cent of sightlines with nearby massive structure,
59 per cent are good, and 41 per cent are bad. For the 26
per cent with nearby less massive structure, 86 per cent are
good and 14 per cent are bad.
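The conditional fractions quoted in this subsection can be cross-checked against one another: the fraction of all sightlines that are both bad and carry a given indicator can be computed either from P(indicator) P(bad | indicator) or from P(bad) P(indicator | bad). The short check below assumes that all the quoted percentages refer to the same sample of massive-cluster sightlines with an overall bad fraction of 8 per cent.

```python
# Fractions quoted in the text for sightlines of massive clusters.
P_BAD = 0.08  # fraction of all sightlines that are bad

QUOTED = {
    # indicator: (P(indicator), P(bad | indicator), P(indicator | bad))
    "substructure": (0.39, 0.10, 0.52),
    "nearby massive structure": (0.04, 0.41, 0.21),
    "nearby less massive structure": (0.26, 0.14, 0.49),
}

def joint_fraction_two_ways(p_ind, p_bad_given_ind, p_ind_given_bad, p_bad=P_BAD):
    """Return P(bad and indicator) computed from either conditional."""
    return p_ind * p_bad_given_ind, p_bad * p_ind_given_bad

for name, fractions in QUOTED.items():
    via_indicator, via_bad = joint_fraction_two_ways(*fractions)
    print(f"{name}: {via_indicator:.3f} vs {via_bad:.3f}")
```

The two routes agree to within about 0.3 per cent of all sightlines for each indicator, which is consistent with rounding of the quoted percentages. The asymmetry between the two conditionals is simply a consequence of bad sightlines being rare.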



6.2 Implications for stacking 

As is well known, correlated errors must be handled with
care. For example, if the source of scatter is correlated,
two non-independent measures can agree and both be in
error. These subtleties must also be borne in mind when
one starts to stack measurements (see also Nord et al. 2008;
Rykoff et al. 2008; Stanek et al. 2010).

Stacking can be done in several ways. Multiple measure- 
ments can be made for a set of objects and then the mean 
of one of the measurements can be taken holding another 
fixed. Alternatively, there may be insufficient signal to mea- 
sure all of the properties on individual objects, so they are 
first stacked on one property and the second is measured 
on the stack. In this case one has the additional freedom to 
either scale the size of any aperture with the first property 
or use a fixed metric aperture. Finally, one can relate two 
properties while holding a third property fixed either by av- 
eraging individual methods or measuring the properties on 
an average (e.g. fix richness and then measure Compton dis- 
tortion and lensing). 

It is known that a scatter between two variables, x and
y, implies that conditional probabilities must be interpreted
with care. For example, there is scatter between halo mass,
$M$, and richness, $N$, which in the mean obey a relation
$\lg M = a + b\lg N$. However the mean (log) mass of halos
in a bin $N \approx N_0$ is not $a + b\lg N_0$. Since there are typically
many more low mass halos than high mass halos, it is likely
that a high richness object is in reality a low mass object
with artificially high richness for its mass rather than an
intrinsically massive object of mean (or low) richness. The
degree of such bias depends on the amount of scatter and
the slope of the halo mass function, which becomes steeper
at both high mass and high redshift. If one estimates the
mass using a method (e.g. lensing) which itself has scatter,
then the degree of error also depends on how correlated the
scatter between these methods is and the relative sizes of
the scatter.
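The size of this effect can be illustrated numerically. The sketch below assumes the simplest version of the toy model in Appendix C: the richness-inferred log mass $m_1 = a + b\lg N_0$ scatters about the true log mass $m$ with a Gaussian of dispersion `sigma1`, and the mass function is a pure exponential in $m$ with slope `alpha` (per unit of $m$, in whatever logarithm base $m$ uses). It evaluates the mean true log mass at fixed $m_1$ on a grid.

```python
import numpy as np

def mean_true_mass(m1_obs, sigma1, alpha, n_grid=40001, n_sigma=12.0):
    """Mean true log mass at fixed observed log mass m1_obs.

    Assumes P(m1 | m) is Gaussian with dispersion sigma1 and that the
    mass function is proportional to exp(-alpha * m).
    """
    m = np.linspace(m1_obs - n_sigma * sigma1, m1_obs + n_sigma * sigma1, n_grid)
    # Log of likelihood times mass function, shifted to avoid overflow.
    log_w = -0.5 * ((m1_obs - m) / sigma1) ** 2 - alpha * (m - m1_obs)
    w = np.exp(log_w - log_w.max())
    return float(np.sum(w * m) / np.sum(w))
```

For example, with `sigma1 = 0.3` and `alpha = 2` the mean true log mass in the bin lies 0.18 below the naive value, matching the analytic shift $\alpha\sigma_1^2$; the shift grows with both the scatter and the steepness of the mass function, as stated in the text.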

For example, if scatter in richness were driven entirely
by line-of-sight projection of nearby structures, and if it were
identical to the amount of mass projected onto the "lens",
then the error in the mass estimated by lensing would cancel
the bias described above. However, if one measures an extra
property, e.g. X-ray flux, which is immune to the projection,
the mass-flux relation one infers from the stack would be
biased to high masses at fixed flux. This would lead to an
incorrect relation between mass and X-ray flux. For detailed
formulae in a simple analytical model see Appendix C (and
Rykoff et al. 2008; Nord et al. 2008).

In general we expect the situation to be slightly more 
complicated in reality (or simulations) than the log-normal, 
analytical model suggests. We saw in the last section that 
a small number of halos are responsible for a fair fraction 
of the outliers, and that the distribution of errors has non- 
Gaussian tails. While the general trends are not altered by 
these issues, they serve to alter the quantitative predictions. 

In fact none of these complications leads to large corrections
to our measured scaling relations. All of the quantities
show strong trends with mass, and all of them have relatively
large scatter. The distribution of points in the
observable-observable plane is therefore determined by the range of
masses being selected much more than by subtle correlations
between the observables. This serves to make any biases
relatively small. While we fully expect biases to be present,
given our limited simulation volume we are not able to measure
them reliably.

Some examples serve to illustrate the main points.
We choose as a fiducial sample all lines of sight with
red-sequence richness $29 < N < 30$ at z ~ 0.1, containing 271
lines of sight from 104 halos. We choose red-sequence richness
as it is one of the more common quantities to stack
on. As expected, the mean mass of these clusters is skewed
low by the steeply falling mass function. The line-of-sight
weighted mean mass is $M_{180b} = 2.2\times10^{14}\,h^{-1}M_\odot$. A
randomly chosen sample of halos with the same mass distribution
has a line-of-sight weighted mean richness of 22-24
(with fluctuations depending on how the sampling is done),
i.e. it is ~ 25 per cent poorer than the input sample. The
mean (and median) values of the velocity dispersion, projected
mass and Compton distortion of this random sample
are also "low". How do these mean values compare to those
of the sample selected on richness? In fact they are quite
similar, differing by < 10 per cent in the mean. This is because
there is a large degree of scatter between red sequence
richness and mass and a strong correlation of all measures
with mass, making selecting on richness approximately the
same as randomly sampling halos with a specific mass
distribution. The joint distributions of e.g. Compton distortion
and velocity dispersion or projected mass and Compton
distortion also turn out to be very similar in the random-
and richness-selected samples. There is a tendency for the
richness-selected sample to have more outliers in velocity
dispersion using $3\sigma$ clipping than the random sample, but
otherwise the joint distributions are almost indistinguishable.

We find similar results by stacking on e.g. velocity dispersion.
The distribution in e.g. the $y$-$\zeta$ plane is the same
for the velocity dispersion selected sample as in a sample of
the same mass distribution.

The largest impact of stacking on e.g. richness for our 
sample then is not the degree to which the scatters in indi- 
vidual measurements are correlated on an object-by-object 
basis but the fact that the stack contains clusters of a wide 
range of masses/sizes. If the measurement being performed 
is a non-linear function of the mass, care must be taken in 
interpreting the meaning of the averaged quantity. 



7 HIGHER REDSHIFT 

Unfortunately our simulation volume is too small to make
robust statements about increasingly rare objects at high
redshift, but in this section we note some trends. According
to Moster et al. (2010) the lower mass limit of our subhalos
corresponds to lower stellar-mass subhalos at higher z,
with the limit dropping from 3 x 10^ h~^MQ at z ~ to 
2 X lO^ft-^M© at z ~ 0.5 to 1 x 10^/i"^Mq at z ~ 1. 
There is little evolution in the characteristic stellar mass in 
the mass function over the same range, so we probe further 
below the break in the mass function at higher z. Since, on 
average, satellite subhalos fell into their host more recently 
at higher z, the satellite fraction is smaller for samples
selected above a given halo mass or stellar mass (see discussion
in e.g. Wetzel & White 2010).



While we have 83 halos with $M > 2\times10^{14}\,h^{-1}M_\odot$ at
z ~ 0.1, this drops to 28 at z ~ 0.5 and only 5 at z ~ 1,
making us increasingly susceptible to "outliers". The number
of massive neighbors per very massive halo increases as
we go to higher z, due to the steepening of the mass function
at the high-mass end. Though the statistics are poor, there
is evidence that the velocity bias of the subhalos is decreasing
with increasing redshift (see also Evrard et al. 2008).
The distribution of the eigenvalues of the velocity ellipsoid
is very similar to that at z ~ 0.1 (shown in Fig. 5), again
leading to large changes in line-of-sight velocity dispersion
with viewing angle.

At z ~ 0.5 we found again that $M_{\rm kin} \propto R_h\sigma_{\rm kin}^2$ was
more tightly correlated with halo mass than the dispersion alone,
as it was at z ~ 0.1. The more complex, phase-space interloper
rejection method continued to perform better than pure $3\sigma$
clipping. In fact the trends of errors and correlations between
mass and phase space richness, Compton distortion,
projected mass and velocity dispersion were unchanged. The
phase-space richness and Compton distortion had the least
dispersion, followed by projected mass and then galaxy
kinematics. The fraction of bad sightlines does not change
substantially going from z ~ 0.1 to z ~ 0.5; however, the fraction
of these bad sightlines with many interlopers from low-mass
halos decreases. As expected from the increasing halo bias
at higher redshift, the distribution of the number of halos
around the massive clusters tended to have a higher mean at
higher redshift. The substructure fraction between z = 0.1
and z = 0.5 was close to unchanged.



8 CONCLUSIONS 

Advances in observational capabilities and a new genera- 
tion of wide-field surveys have led to an explosion in multi- 
wavelength samples of galaxy clusters. By studying a cluster 
in many different wavebands, and from many different ap- 
proaches, we can obtain complementary information about 
the physical state of the clusters and mitigate the system- 
atic errors in any single measurement. Combining different 
measurements of cluster properties has to be done carefully 
however, because the environment in which clusters form 
leads to features which can be correlated across methods. 
As the correlation is not perfect, such a combination will 
provide improvements over any individual method if done 
correctly. 

In this paper we have used high resolution N-body sim- 
ulations of a cosmological volume to study how the large- 
scale environment of clusters leads to correlated scatter in 
measures of cluster size, specifically those based upon rich- 
ness, Compton distortion, lensing or velocity dispersions. 
Our simulation has enough force and mass resolution to 
track the subhalos which are expected to host galaxies, al- 
lowing us to study dynamical probes of the cluster with real- 
istic samples incorporating a hierarchy of substructures and 
retaining correlations between subhalo positions, velocities 
and environment. For this reason we paid particular attention
to dynamical probes of cluster size.

[Footnote: We did not consider red galaxy richness at higher redshift as
the method we used to assign color (Skibba & Sheth 2009) was
only calibrated by observations for lower redshifts.]

As might be expected in hierarchical structure formation
scenarios, groups of subhalos retain their identity for
long periods within larger host halos. This leads to a "lack of
virialization", so that substructures can behave
quite coherently in phase space. The highly anisotropic
nature of infall into massive clusters, and their triaxiality,
means that line-of-sight galaxy velocity dispersions (or virial
masses) for any individual halo show large variance depending
on viewing angle. This suggests that dispersion-based
mass estimators will work better in an ensemble sense than
for any individual cluster and that obtaining more than tens
of redshifts for any given cluster will not reduce the inferred
mass error. We discussed the effect that interloper galaxies, and
their removal, have on kinematic measurements and compared
different schemes for interloper removal. Results were
presented both for individual clusters and for a "stacked"
ensemble cluster.

All of our simulated clusters contain highly evident sub- 
structure, with groups of subhalos which fall in together 
moving in a coherent fashion for several Gyr. However stan- 
dard substructure indicators frequently miss this substruc- 
ture, and often give very different answers for a single object 
viewed from different directions. These results suggest cau- 
tion when interpreting lack of observed substructure in the 
galaxy distribution as evidence for a dynamically relaxed, 
steady-state object (e.g. justifying the use of the virial the- 
orem or Jeans analysis without the time derivative). A clus- 
ter can be undergoing substantial mass accretion, i.e. be far 
from steady state, and still not be seen to have substructure 
along some lines of sight. The viewing angle dependence also 
complicates inferences about incidence of dynamical evolu- 
tion of cluster galaxies from observed interactions of sub- 
clusters within the cluster identified through substructure 
finding techniques. 

Finally we note that many observational probes of clusters
suffer from projection effects, and that these are exacerbated
by the dense, active and anisotropic environments
surrounding these massive objects. We found an increased
incidence of nearby massive and less massive halos, and of
substructure, when two of our measures (richness, lensing,
Compton distortion and velocity dispersions) simultaneously
had large outliers in predicted mass. The converse was not
always true: scatter in environment or the measurement of
substructure did not necessarily imply large outliers.

Since the orientation of the velocity ellipsoid is corre- 
lated with the large-scale structure, velocity outliers also 
correlate with projection induced outliers. For many cases 
the same structure causes scatter in different observations: 
such scatters can be substantially correlated and this cor- 
relation needs to be properly incorporated when combining 
measurements.

APPENDIX A: FINDING SUBHALOS

In our past work we have used the Subfind algorithm
(Springel et al. 2001) to find subhalos. However we have
found that a phase-space based approach performs better
at tracking the subhalos in our most massive hosts and
for this reason we have switched to this new scheme here.
In particular we follow Diemand, Kuhlen & Madau (2006)
and implement a phase-space friends-of-friends finder.
Detailed experimentation, including a one-to-one comparison
of the new finder with the results of Subfind, suggests that
choosing the configuration-space linking length to be 0.078
of the mean interparticle spacing and the velocity linking
length to be $e^{-1} \simeq 0.368$ of the halo (1D) velocity
dispersion gives a good subhalo catalog. As discussed in
Diemand, Kuhlen & Madau (2006) the results are stable to
modest changes in these parameters. The most massive subhalos
are the same for both finders, but the lower mass structures
which pass close to the center of the halo are more
robustly tracked in the phase-space method than with Subfind.
We keep all 6D FoF halos which contain more than 20
particles. For technical, book-keeping reasons, if fewer than
two subhalos (i.e. a central and a satellite) are found in any
host halo we slowly increase the linking lengths in that halo
until one or two subhalos are found. This ensures that there
are no low-mass halos which have no subhalos, simplifying
the book-keeping in the tracking scheme. This affects only
the very low mass halos which are not used in this paper.
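To make the linking step concrete, the sketch below groups particles with a phase-space friends-of-friends criterion. It assumes that two particles are linked when the sum of their squared separations in position and velocity, each scaled by its own linking length, is less than one; this combined criterion is our reading of the phase-space approach and is stated here as an assumption. The brute-force pair search and the function name are for illustration only and would not scale to a full simulation.

```python
import numpy as np

def fof_6d(pos, vel, link_x, link_v, min_particles=20):
    """Phase-space friends-of-friends groups with more than min_particles members.

    link_x would be 0.078 of the mean interparticle spacing and link_v
    0.368 of the host 1D velocity dispersion in the notation of the text.
    """
    pos = np.asarray(pos, dtype=float)
    vel = np.asarray(vel, dtype=float)
    n = len(pos)
    parent = np.arange(n)

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n - 1):
        dx2 = np.sum((pos[i + 1:] - pos[i]) ** 2, axis=1) / link_x ** 2
        dv2 = np.sum((vel[i + 1:] - vel[i]) ** 2, axis=1) / link_v ** 2
        for j in np.nonzero(dx2 + dv2 < 1.0)[0] + i + 1:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri

    roots = np.array([find(i) for i in range(n)])
    labels, counts = np.unique(roots, return_counts=True)
    return [np.nonzero(roots == r)[0] for r, c in zip(labels, counts)
            if c > min_particles]
```

The velocity term is what allows two groups that overlap in position near the host center to remain distinct, which is the regime where the text reports the largest gain over a configuration-space finder.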



APPENDIX B: THE RED SEQUENCE 

It has become common to use the tight red sequence of 
galaxies found in clusters in order to isolate putative cluster 
members from chance alignments along the line-of-sight dur- 
ing cluster detection. The evolution of the red sequence with 
redshift means that choosing red galaxies within a certain 
color cut also tends to give galaxies at a certain redshift. 
Because color-based cluster finders have become so popular, 
we have included a toy-model of a color-based richness in 
our mocks. There are two steps: first, to assign colors to the
mock galaxies, and second, to ask how the observed properties
depend on redshift/distance. We take each of these in turn.



B1 Color assignment

We first put colors into our z ~ 0.1 box using the method
of Skibba & Sheth (2009). Their approach has red and blue
galaxies being drawn from two different populations
(each with an $M_r$-dependent mean and scatter), where
the probability of a galaxy belonging to either population
depends upon $M_r$ and whether it is a central or satellite
galaxy.

lg M_inf    M_r      f_red    P_cen    P_sat

11.50      -19.1     0.48     0.41     0.62
12.00      -20.2     0.56     0.50     0.69
12.50      -20.9     0.60     0.56     0.76
13.00      -21.4     0.64     0.60     0.83
13.50      -21.8     0.67     0.63     0.91

Table B1. Magnitudes and red fractions as a function of infall
mass (in $h^{-1}M_\odot$) from Skibba & Sheth (2009). $f_{\rm red}$ is the
fraction of all galaxies which are red, while $P_{\rm cen}$ and $P_{\rm sat}$ are the
probabilities that a central or satellite galaxy of that infall mass
is red.

We associate r-band magnitude with infall mass by
abundance matching, ignoring any scatter in the $M_r - M_{\rm inf}$
relation for simplicity. Skibba & Sheth (2009) approximate
the probability of a satellite to be red as

$$
P_{\rm sat}(M_r) = \frac{\langle g-r|M_r\rangle_{\rm sat} - \langle g-r|M_r\rangle_{\rm blue}}
                        {\langle g-r|M_r\rangle_{\rm red} - \langle g-r|M_r\rangle_{\rm blue}}
\tag{B1}
$$

where

$$
\begin{aligned}
\langle g-r|M_r\rangle_{\rm sat}  &= 0.83 - 0.08\,(M_r + 20) \\
\langle g-r|M_r\rangle_{\rm red}  &= 0.93 - 0.03\,(M_r + 20) \\
\langle g-r|M_r\rangle_{\rm blue} &= 0.62 - 0.11\,(M_r + 20)
\end{aligned}
\tag{B2}
$$

and find an overall red fraction

$$
f_{\rm red}(M_r) \simeq 0.54 - 0.07\,(M_r + 20)
\tag{B3}
$$

Given $P_{\rm sat}(M_r)$, $f_{\rm red}$ and $f_{\rm sat}$ one can solve for $P_{\rm cen}(M_r)$; see
Table B1. One then takes every galaxy, satellite or central,
and randomly decides whether it is red or blue. If needed
its actual color can be taken from the Gaussian fits to the
color-magnitude relations given by Skibba & Sheth (2009).
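The assignment can be sketched directly from Eqs. (B1)-(B3). The sketch assumes that "solving for $P_{\rm cen}$" means inverting $f_{\rm red} = f_{\rm sat}P_{\rm sat} + (1-f_{\rm sat})P_{\rm cen}$, and it treats the satellite fraction `f_sat` as an input, since its value is not quoted in this appendix; the function names are illustrative.

```python
import numpy as np

# Coefficients (c0, slope) of <g-r | M_r> = c0 - slope * (M_r + 20), Eq. (B2).
COLOR_COEFF = {"sat": (0.83, 0.08), "red": (0.93, 0.03), "blue": (0.62, 0.11)}

def mean_color(mr, kind):
    c0, slope = COLOR_COEFF[kind]
    return c0 - slope * (mr + 20.0)

def p_sat_red(mr):
    """Probability that a satellite of magnitude M_r is red, Eq. (B1)."""
    blue = mean_color(mr, "blue")
    return (mean_color(mr, "sat") - blue) / (mean_color(mr, "red") - blue)

def f_red(mr):
    """Overall red fraction, Eq. (B3)."""
    return 0.54 - 0.07 * (mr + 20.0)

def p_cen_red(mr, f_sat):
    """Solve f_red = f_sat * P_sat + (1 - f_sat) * P_cen for P_cen."""
    return (f_red(mr) - f_sat * p_sat_red(mr)) / (1.0 - f_sat)

def assign_red(mr, is_satellite, f_sat, rng):
    """Randomly label each galaxy red (True) or blue (False)."""
    p = np.where(is_satellite, p_sat_red(mr), p_cen_red(mr, f_sat))
    return rng.random(np.shape(mr)) < p
```

With the rounded coefficients quoted above, the faintest row of Table B1 is reproduced ($P_{\rm sat} \simeq 0.62$ and $f_{\rm red} \simeq 0.48$ at $M_r = -19.1$); the brighter $P_{\rm sat}$ entries come out a few per cent higher than tabulated, presumably because the coefficients are quoted to two decimal places.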

B2 Evolution with redshift 

The fact that the observed colors of galaxies evolve with 
distance means that a tight sequence in color (e.g. the 
red sequence) will shift out of any thin color slice as the 
galaxies shift in distance. Thus cuts in color can be used 
to isolate galaxies within a small range of distances (e.g.
Gladders & Yee 2000, 2005). Modeling the evolution of
galaxy colors ab initio is notoriously difficult, but a hybrid 
method based on stellar population synthesis models can iso- 
late the main features for our purposes of making "pseudo" 
light cones. 

We simplify our problem by assuming that blue galax- 
ies can be distinguished from red at any distance and we 
need only consider the evolution of the red galaxies. We 
make the further simplification that all of the red galaxies 
are evolving passively, with star formation having ceased 
at some high redshift (e.g. z > 2). Using the stellar population
models described in (Conroy, Gunn & White 2009;
Conroy, White & Gunn 2010; Conroy & Gunn 2010) we
find that the $g - r$ color of a passively evolving population
scales with redshift as $d(g-r)/dz \sim 1$-$2$, with the
precise slope depending on the star formation history assumed.
Similarly, the absolute r-band magnitude scales as
$dM_r/dz \sim 0.1$-$1$. For our cosmology $d\chi/dz = 2900\,h^{-1}$Mpc,
where $\chi$ is the (comoving) line-of-sight distance and a linear
approximation is acceptable over the limited extent of our
simulation.

Given a color cut of a certain width the speed at which
the color of the red sequence "ridgeline" changes with z defines
the range of distance over which galaxies will be selected.
We encode this information as the probability that
a red galaxy at a given distance will fall into the fiducial red
sequence cut (cf. Cohn et al. 2007).

If the width and peak of the red sequence were independent
of the magnitude this transformation would be trivial:
for a Gaussian color distribution and a fixed width $\Delta c$
the interloper probability is the difference of two error functions
with width $\Delta c/(dc/d\chi)$. However the non-zero dependence
on $M_r$ slightly complicates the calculation. To include
this complication we first calculate the corresponding magnitude
for every red galaxy as if it were actually at redshift
$0.1 + \delta z$, corresponding to its offset from the box midplane,
but dimmed (or brightened) by the change in distance. We
then calculate what $M_r$ and thus $g - r$ distribution will
result for this dimmed galaxy as it evolved to z = 0.1, assuming
the linear evolution defined above. The evolved $M_r$
at z = 0.1 has a z = 0.1 color ($g - r$) distribution well fit by
a Gaussian distribution (Skibba & Sheth 2009). The color
is evolved back to $z = 0.1 + \delta z$ to give the observed color
distribution for the galaxy at $z = 0.1 + \delta z$ with magnitude
$M_r$ estimated as if it were at z = 0.1. The interloper fraction
of galaxies is the integral of the observed distribution within
the cut defining the red sequence.
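The magnitude-independent limit mentioned at the start of the paragraph above can be written down in a few lines. The sketch assumes a cut of full width `cut_width` centered on the ridgeline color at the cluster redshift and a Gaussian intrinsic color scatter `color_scatter`; the scatter value is a hypothetical placeholder, `dc_dz` is taken from the middle of the range quoted earlier, and the $M_r$ dependence described in the text is deliberately ignored.

```python
import math

def p_in_red_sequence_cut(dchi, cut_width=0.05, color_scatter=0.05,
                          dc_dz=1.5, dchi_dz=2900.0):
    """Probability that a red galaxy offset by dchi (h^-1 Mpc) passes the cut.

    The ridgeline color shifts linearly with distance, so the probability
    is a difference of two error functions.
    """
    shift = dc_dz * dchi / dchi_dz            # ridgeline color offset
    norm = math.sqrt(2.0) * color_scatter
    upper = (0.5 * cut_width - shift) / norm
    lower = (-0.5 * cut_width - shift) / norm
    return 0.5 * (math.erf(upper) - math.erf(lower))
```

The probability is symmetric in the sign of the offset and falls monotonically with distance from the cluster, which is the behavior that the Gaussian selection function adopted in the text is meant to approximate.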

If we make our red sequence selection have $g - r$ width
0.05 we find the dispersion in distance runs between 50 and
100 $h^{-1}$Mpc, depending on stellar population model, galaxy
magnitude etc. To be conservative, in the sense of reducing
line-of-sight projection, we take the lower end of the range
and assume a red galaxy in the foreground or background
of the cluster of interest is included within the red sequence
with a Gaussian probability of width 50 $h^{-1}$Mpc. As with
the other measures, we apodize the selection to ensure the
probability is zero at the limits of the simulation.



[Footnote: Even though our box is at a single output time, we assume
line-of-sight distance corresponds to redshift. The evolution of
the large-scale structure over the relevant time interval is so small
that it may be safely neglected.]

APPENDIX C: MEASUREMENTS WITH
CORRELATED SCATTER AND STACKING

It is helpful to consider a simple analytic model which
illustrates the effect of correlated scatter on different observables.
We will consider the case of two measurements, $m_1$
and $m_2$, of some quantity $m$. For example, one could consider
$m_1$ to be richness-inferred (log) mass and $m_2$ to be
lensing-inferred (log) mass with $m$ the "true" (log) mass.
We imagine that $P(m_1, m_2|m)$ is a bi-variate Gaussian with
means $\mu_i(m)$ and covariance

$$
{\rm Cov}[m_1, m_2] =
\begin{pmatrix}
\sigma_1^2 & \rho_{12}\sigma_1\sigma_2 \\
\rho_{12}\sigma_1\sigma_2 & \sigma_2^2
\end{pmatrix}
\tag{C1}
$$

and for simplicity we assume that $\sigma_i$ and $\rho_{12}$ are independent
of $m$ and that $\mu_i = a_i + b_i m$. We shall write the probability
that a cluster has mass $m$, the mass function, as $P(m)$ and
for convenience/simplicity take it to be a power-law in mass
or $P(m) \propto \exp[-\alpha m]$.

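As is well known, when $\sigma_1 > 0$ and $\alpha \neq 0$ the mean
"true" mass of clusters with measured mass $m_1^{\rm obs}$ is biased.
Since $P(m|m_1) \propto P(m_1|m)P(m)$ we have

$$
\langle m \,|\, m_1 = m_1^{\rm obs} \rangle
  = \frac{m_1^{\rm obs} - a_1}{b_1} - \alpha\,\frac{\sigma_1^2}{b_1^2}
\tag{C2}
$$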

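Similarly, if we consider the case where $m_1$ is known (e.g.
selected) to be $m_1^{\rm obs}$ the conditional distribution of $m_2$ at
fixed $m$ is also a Gaussian with variance $(1 - \rho_{12}^2)\sigma_2^2$ and mean

$$
\langle m_2 \,|\, m_1 = m_1^{\rm obs} \rangle
  = \mu_2 + \rho_{12}\,\frac{\sigma_2}{\sigma_1}\left(m_1^{\rm obs} - \mu_1\right)
\tag{C3}
$$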

These facts allow us to consider computing $m_2$ as a
function of $m$ by binning on $m_1$ and averaging the measures
of $m_2$ in each bin. It is easy to show that if $\sigma_1 = 0$ one
simply obtains $\langle m_2\rangle = a_2 + b_2 m = a_2 + (b_2/b_1)(m_1^{\rm obs} - a_1)$
as desired. However if the $\sigma_i > 0$ we have more terms. By
writing $P(m_1, m_2, m) = P(m_2|m, m_1)P(m|m_1)P(m_1)$ and
recalling that the $\mu_i$ are linear in $m$, one finds Eq. (C3) with
$\mu_i(\langle m\rangle)$ in place of $\mu_i(m)$ leading to

$$
\langle m_2 \rangle = a_2 + \frac{b_2}{b_1}\left(m_1^{\rm obs} - a_1\right)
  + \frac{\alpha}{b_1}\left[\rho_{12}\sigma_1\sigma_2 - \frac{b_2}{b_1}\sigma_1^2\right]
\tag{C4}
$$

Consider the case $a_i = 0$ and $b_i = 1$, i.e. the measurements
give unbiased estimates of the (log) mass for halos of
a fixed mass: $\langle m_i|m\rangle = m$. Stated another way, the average
of each mass estimate in narrow bins of halo mass returns
the correct mass and in such bins each measurement correctly
predicts the mass which would be estimated by every
other measurement. If one could bin in mass, it would be
straightforward to estimate the mean observable-mass relation.

The situation changes when we bin not by mass but
by observable, e.g. richness. In this case, even though the
richness-based mass estimator is unbiased, the (unobservable)
true mean halo mass in the bin is biased (low) because
the falling mass function makes it more likely that a halo of
richness $N$ is a low mass halo which fluctuated up than a
high mass halo which fluctuated down in richness. Similarly,
the mean mass estimated from a second observable in that
bin differs by $\alpha(\rho_{12}\sigma_1\sigma_2 - \sigma_1^2)$ from
the first observable defining the bin, as in Eq. (C4). That
observable-mass relation is thus also biased. Note that the bias
disappears if $\rho_{12} = 1$ and $\sigma_1 = \sigma_2$, in which
case the errors conspire to cancel exactly because fluctuations in
observable one directly imply compensating fluctuations in
observable two. For example, a low mass halo which had an
abnormally high richness would be counted in the richness bin even
when it "should not be". But its lensing signal would also be
abnormally high by just the right amount to give the right mean
mass in the richness bin.
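
The two offsets described above can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original analysis: it assumes $a_i = 0$, $b_i = 1$, illustrative values of $\alpha$, $\sigma_1$, $\sigma_2$ and $\rho_{12}$, and a lower cutoff at $m = 0$ so that $P(m) \propto \exp[-\alpha m]$ can be normalised. The function name `simulate_bias` and all parameter values are hypothetical choices for illustration.

```python
import numpy as np


def simulate_bias(alpha=1.0, sigma1=0.4, sigma2=0.3, rho12=0.5,
                  m1_bin=(2.0, 2.1), n=1_000_000, seed=0):
    """Compare Monte Carlo offsets in one bin of m1 with the toy-model predictions."""
    rng = np.random.default_rng(seed)
    # True (log) masses with P(m) ∝ exp(-alpha m); the cutoff at m = 0 is an
    # assumption made only so that the distribution can be sampled.
    m = rng.exponential(scale=1.0 / alpha, size=n)
    # Correlated Gaussian scatter with the covariance of Eq. (C1).
    cov = [[sigma1**2, rho12 * sigma1 * sigma2],
           [rho12 * sigma1 * sigma2, sigma2**2]]
    noise = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    m1 = m + noise[:, 0]  # a_1 = 0, b_1 = 1
    m2 = m + noise[:, 1]  # a_2 = 0, b_2 = 1
    # Select on the first observable only, well above the cutoff.
    sel = (m1 >= m1_bin[0]) & (m1 < m1_bin[1])
    m1_mean = m1[sel].mean()
    return {
        "n_in_bin": int(sel.sum()),
        "true_offset": m[sel].mean() - m1_mean,
        "m2_offset": m2[sel].mean() - m1_mean,
        "true_offset_pred": -alpha * sigma1**2,
        "m2_offset_pred": alpha * (rho12 * sigma1 * sigma2 - sigma1**2),
    }


result = simulate_bias()
print(result)
```

With these assumed parameters the predicted offsets are $-\alpha\sigma_1^2 = -0.16$ for the true mass and $\alpha(\rho_{12}\sigma_1\sigma_2 - \sigma_1^2) = -0.10$ for the second observable; the sampled values should agree with these to within the statistical noise of the bin.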

Finally, if we estimate a third observable in the same
bins of e.g. richness it will be biased by a different amount:
$\alpha(\rho_{13}\sigma_1\sigma_3 - \sigma_1^2)$. The relation
between observables 2 and 3, when binned on 1, is thus biased in
both coordinates. Though we have not considered it in our toy
model, this bias may well be mass dependent.

In these examples the bias is due entirely to the falling
mass function, because we assumed explicitly that $a_i = 0$
and $b_i = 1$, i.e. the measurements give unbiased estimates
of the mass. Using the results above, the general case can
be considered but we gain no further insight from doing so.